What Does Microsoft Syntex Do?
Microsoft Syntex combines advanced AI with machine teaching models to do three things automatically: identify a document's content type, extract metadata from the document, and apply a sensitivity or retention label.
1. Identifies a Content Type
You teach Syntex to recognize what your document type looks like (and what it doesn’t look like). It would be difficult to teach Syntex to identify a broad content type like “project document.” For best results in Syntex, documents need to look similar or have similar phrases in them for the AI to differentiate them.
2. Extracts Metadata
Using either form processing or document understanding models (more on that in another article), you can train Syntex to find metadata in documents and extract it to SharePoint columns. For example, if an invoice always has a data label of “Invoice Number,” we can train the model to extract whatever comes after that label in the document. Now think about how other labels like “Invoice #” or “Invoice No” could complicate extracting that information. You would need to train the model to recognize every possible prefix for the data you are trying to capture.
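To make the prefix problem concrete, here is a minimal sketch in plain Python. It is not how Syntex works internally (Syntex models are trained through its own interface, not written as code), and the names LABEL_VARIANTS, build_pattern, and extract_invoice_number are made up for illustration. It only shows why an extractor that knows a single label misses documents that use a different one.

```python
import re

# Hypothetical label variants for illustration only. Syntex itself is
# trained on example documents, not configured with regular expressions.
LABEL_VARIANTS = ["Invoice Number", "Invoice No", "Invoice #"]


def build_pattern(labels):
    """Build one regex that matches any known label followed by a value."""
    # Try longer labels first so "Invoice Number" is not cut short by "Invoice No".
    ordered = sorted(labels, key=len, reverse=True)
    alternatives = "|".join(re.escape(label) for label in ordered)
    # Allow an optional "." or ":" and whitespace between the label and the value.
    return re.compile(rf"(?:{alternatives})\.?\s*:?\s*([A-Za-z0-9-]+)", re.IGNORECASE)


def extract_invoice_number(text, labels=LABEL_VARIANTS):
    """Return the value that follows a known label, or None if no label matches."""
    match = build_pattern(labels).search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(extract_invoice_number("Invoice Number: INV-1001"))
    print(extract_invoice_number("Invoice # 2045"))
    # An extractor that only knows one label finds nothing here.
    print(extract_invoice_number("Invoice # 2045", labels=["Invoice Number"]))
```

The last call returns nothing because the only label it knows never appears in the text, which is the same gap you would leave in a Syntex model if you trained it on just one prefix.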
3. Applies Data Labels
Both sensitivity and retention labels can be applied automatically based on the content type identified.
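Conceptually, this works like a lookup from content type to labels. The sketch below is only an illustration of that idea in Python. The content types, the label names, and the LabelPolicy and labels_for names are all hypothetical. In Syntex you choose the labels in the model's settings rather than writing code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabelPolicy:
    """Labels to apply to a document (either one may be absent)."""
    sensitivity: Optional[str] = None
    retention: Optional[str] = None


# Hypothetical mapping; the content types and label names are invented examples.
LABEL_POLICIES = {
    "Invoice": LabelPolicy(sensitivity="Confidential", retention="Keep 7 years"),
    "Contract": LabelPolicy(sensitivity="Highly Confidential"),
}


def labels_for(content_type: str) -> LabelPolicy:
    """Look up the labels for an identified content type; unknown types get none."""
    return LABEL_POLICIES.get(content_type, LabelPolicy())


if __name__ == "__main__":
    print(labels_for("Invoice"))
    print(labels_for("Meeting Notes"))
```

The point of the sketch is that labeling depends on the content type being identified first, so a document that is not classified gets no labels.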
Summary
Using AI that we train, Microsoft Syntex identifies the content type of the document, extracts metadata from that document, and applies data labels automatically. All you have to do is move documents into a library that has a Syntex model applied to it.
Scenarios where this could help:
Moving from paper documents to electronic records
Moving away from file shares and into SharePoint Online
Migrating from on-prem to SharePoint Online
Extracting metadata from existing documents where a content type was not defined and metadata was never captured
Discovering and labeling sensitive content
Applying retention labels when moving records to a centralized repository