I have a text file as follows:
"0","Column","column2","Column3"
I have managed to split the data into the following:
"0"
"Column"
"column2"
"Column3"
I got there with this expression:
,(?=(?:[^']*'[^']*')*[^']*$)
Now I want to remove the quotes. I have tested the expression
[^\s"']+|"([^"]*)"|\'([^\']*)
in an online regex tester, which gives the correct output I'm looking for. However, I am getting a syntax error when using the expression:
String[] columns = Regex.Split(dataLine, "[^\s"']+|"([^"]*)"|\'([^\']*)");
Syntax error ',' expected
I've tried escaping characters, but to no avail. Am I missing something?
Any help would be greatly appreciated!
Thanks.
Answer
The problem is the double quotes inside the regex: the compiler chokes on them because it thinks they mark the end of the string.
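You must escape them. In a regular string literal the backslash in \s has to be doubled as well, because \s is not a valid C# escape sequence:
"[^\\s\"']+|\"([^\"]*)\"|\'([^\']*)"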
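To make the escaping concrete, here is a small sketch that writes the pattern from the question twice, once as a regular string literal and once as a verbatim (@) literal, and checks that both yield the same regex text. The class and constant names are made up for illustration; the pattern itself is copied from the question unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PatternLiterals
{
    // Regular literal: each double quote is written \" and each regex backslash is doubled.
    public const string Regular = "[^\\s\"']+|\"([^\"]*)\"|\\'([^\\']*)";

    // Verbatim literal: backslashes are taken literally and each double quote is doubled.
    public const string Verbatim = @"[^\s""']+|""([^""]*)""|\'([^\']*)";

    public static void Main()
    {
        // Both literals compile to exactly the same pattern text.
        Console.WriteLine(Regular == Verbatim); // True

        // The second alternative captures the text between double quotes in group 1.
        Match m = Regex.Match("\"Column\"", Regular);
        Console.WriteLine(m.Groups[1].Value); // Column
    }
}
```

The verbatim form is usually easier to read for regexes, since only the double quotes need special treatment.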
Edit:
You can actually do everything you want with one regex, without splitting first:
@"(?<=[""])[^,]*?(?=[""])"
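Here I use an @-quoted (verbatim) string, where double quotes are doubled instead of backslash-escaped.
The regex uses a lookbehind to require a preceding double quote, then lazily matches any character except a comma ',' zero or more times, then uses a lookahead to require a following double quote. Because commas are excluded, a quoted field that itself contains a comma will not be matched as a whole.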
How to use:
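// Requires: using System; using System.Text.RegularExpressions;
string test = @"""0"",""Column"",""column2"",""Column3""";
Regex regex = new Regex(@"(?<=[""])[^,]*?(?=[""])");
// Each match is one column value with the surrounding quotes already stripped.
foreach (Match match in regex.Matches(test))
{
    Console.WriteLine(match.Value);
}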
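If you want the values in a String[] columns, as in the question, rather than printed, the matches can be collected with LINQ. This is only a sketch: ExtractColumns and ColumnExtractor are hypothetical names, and it assumes that no field contains a comma or an embedded double quote.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ColumnExtractor
{
    // Hypothetical helper: returns the unquoted column values of one data line.
    // Assumes fields contain neither commas nor embedded double quotes.
    public static string[] ExtractColumns(string dataLine)
    {
        return Regex.Matches(dataLine, @"(?<=[""])[^,]*?(?=[""])")
                    .Cast<Match>()          // MatchCollection is a non-generic IEnumerable
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        string dataLine = @"""0"",""Column"",""column2"",""Column3""";
        string[] columns = ExtractColumns(dataLine);
        Console.WriteLine(string.Join("|", columns)); // 0|Column|column2|Column3
    }
}
```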