Parse Custom Delimiters

April 8, 2015

This experiment shows how to import and transform a text dataset that uses a custom delimiter.
AzureML currently does not support parsing text data with custom delimiters. The workaround is to first import the dataset as a tab-separated file, which creates a table with a single column of text. You can then use Execute Python Script to tokenize the text on the custom delimiter and create a dataset with the correct columns. Data parsed in this manner is often heterogeneous (i.e., it contains a mix of types), so we use the Metadata Editor module to coerce each column to the appropriate type. Thanks to AzureML user Taheer-Naveed for posting this question in the forum (
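
The tokenizing step could look roughly like the sketch below. This is not the script from the experiment itself. It assumes the delimiter is a pipe character, that the imported table has no header row, and that generic column names such as Col1 are acceptable. DELIMITER and split_rows are hypothetical names introduced for illustration; azureml_main is the entry point that Execute Python Script calls.

```python
DELIMITER = "|"  # assumption: replace with the delimiter used in your data


def split_rows(lines, delimiter=DELIMITER):
    """Split each line of text on the delimiter; returns a list of token lists."""
    return [str(line).split(delimiter) for line in lines]


def azureml_main(dataframe1=None, dataframe2=None):
    # dataframe1 is the single-column table produced by the tab-separated import.
    # pandas is imported here so split_rows can be used without it.
    import pandas as pd

    rows = split_rows(dataframe1.iloc[:, 0])

    # Assumption: the input is non-empty and has no header row,
    # so the columns get generic names.
    width = max(len(row) for row in rows)
    columns = ["Col{}".format(i + 1) for i in range(width)]

    # Execute Python Script expects a sequence of data frames as the result.
    return pd.DataFrame(rows, columns=columns),
```

Every value produced this way is a string, which is why a type-coercion step such as the Metadata Editor is still needed afterwards.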