Find delimiter
Stephen G.
11-01-2008, 10:52 AM
Is there a way to read a text file and identify the delimiter without the need to specify the delimiter type in the code or by the user?
Example...
The first text file reads:
John Doe;;123 Riverside Drive;;Beverly Hills;;CA;;90210;;etc...
The second text file reads:
John Doe|123 Riverside Drive|Beverly Hills|CA|90210|etc...
The output should report the delimiter used in each text file:
"The delimiter found is ;;"
"The delimiter found is |"
I tried using string.find and searching for different types of patterns, but it seemed to be overkill.
Thanks
Imagine Programming
11-01-2008, 10:56 AM
What if each file simply starts with its delimiter, e.g. ||John Doe or ;;John Doe? Then you can determine it like this:
sTextfile = TextFile.ReadToString("textfile.txt");
sDelimiter = String.Mid(sTextfile, 1, 2);
Stephen G.
11-01-2008, 11:18 AM
The transactions are prepared by hundreds of different users, each using their own accounting software and their own choice of delimiter.
I must gather the info from each file (thousands) in an attempt to organize the data.
The delimiter is usually shown after the first entry like this:
Credit card|$34.99 |$43.24 |$3.30 |$4.95
In some cases the delimiter is arbitrary, such as "---", which makes it difficult to search for commonly used delimiters.
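For arbitrary delimiters such as "---", one possible approach is to skip the candidate list entirely and look for the run of punctuation that repeats most often in a line. The sketch below is only an illustration: GuessDelimiter is a hypothetical name, it uses the standard Lua string library (assuming Lua 5.1 or later) rather than the String.* actions, and it assumes that "$" and "." belong to values, not to the delimiter. Lines with heavy punctuation inside the values (hyphenated names, commas in addresses) can still fool it.

```lua
-- Hypothetical helper: guess the delimiter of one line without a candidate list.
-- Assumption: letters, digits, spaces, "$" and "." are part of the values.
function GuessDelimiter(sLine)
    local tbCounts = {}
    -- Count every run of characters that could be a delimiter
    for sRun in string.gmatch(sLine, "[^%w%s%$%.]+") do
        tbCounts[sRun] = (tbCounts[sRun] or 0) + 1
    end
    local sBest, nBest = "", 0
    for sRun, nCount in pairs(tbCounts) do
        -- Prefer the most frequent run; on a tie, prefer the longer one
        if nCount > nBest or (nCount == nBest and #sRun > #sBest) then
            sBest, nBest = sRun, nCount
        end
    end
    -- Returns "" when the line contains no punctuation run at all
    return sBest
end
```

Called on the sample line "Credit card|$34.99 |$43.24 |$3.30 |$4.95", the only run that is counted is "|", because "$" and "." are excluded by the pattern.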
RizlaUK
11-01-2008, 11:51 AM
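Here is how I would do it:
function GetDelimiter(sDelimitedString, tbDelimiters)
    local sRet = "";
    -- Check the candidates in list order and stop at the first one found
    for index, sDelimiter in ipairs(tbDelimiters) do
        if String.ReverseFind(sDelimitedString, sDelimiter, false) ~= -1 then
            sRet = sDelimiter;
            break;
        end
    end
    -- An empty string means none of the candidates was found
    return sRet;
end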
-- Test
Dialog.Message("Test", "The delimiter for this string is '"..GetDelimiter("Item1|Item2|Item3|Item4|Item", {"|",",",";","/"}).."'");
Dialog.Message("Test", "The delimiter for this string is '"..GetDelimiter("Item1,Item2,Item3,Item4,Item", {"|",",",";","/"}).."'");
Dialog.Message("Test", "The delimiter for this string is '"..GetDelimiter("Item1;Item2;Item3;Item4;Item", {"|",",",";","/"}).."'");
Dialog.Message("Test", "The delimiter for this string is '"..GetDelimiter("Item1/Item2/Item3/Item4/Item", {"|",",",";","/"}).."'");
Quote: "In some cases, the delimiter is random like, "---", making it difficult to search commonly used delimiters."
The only thing I can suggest for that is to add each new delimiter to the tbDelimiters table as it is encountered.
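One caveat with a growing candidate list: a short candidate such as ";" is also found in a file delimited by ";;", so whichever comes first in the table wins. A minimal sketch of one way around this is to test the longest candidates first. GetDelimiterLongestFirst is a hypothetical name, and the code uses the standard Lua string and table libraries (assuming Lua 5.1 or later) instead of String.ReverseFind.

```lua
-- Hypothetical variant of GetDelimiter that tries longer candidates first.
function GetDelimiterLongestFirst(sDelimitedString, tbDelimiters)
    -- Copy the list so the caller's table is left untouched
    local tbSorted = {}
    for _, sDelimiter in ipairs(tbDelimiters) do
        tbSorted[#tbSorted + 1] = sDelimiter
    end
    -- Longest candidates first, so ";;" is tested before ";"
    table.sort(tbSorted, function(a, b) return #a > #b end)
    for _, sDelimiter in ipairs(tbSorted) do
        -- The fourth argument "true" asks for a plain-text search (no patterns)
        if string.find(sDelimitedString, sDelimiter, 1, true) then
            return sDelimiter
        end
    end
    -- An empty string means none of the candidates was found
    return ""
end
```

With this ordering the candidate table can be extended in any order as new delimiters turn up.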
presidente
11-01-2008, 12:26 PM
If you know that the $ symbol comes in front of each amount, you can parse for the symbols between a number and the $ symbol.
Normally there should be a standard for how the accounting software formats its data.
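As a rough illustration of that idea, a single Lua pattern can capture whatever sits between a digit and the next "$". DelimiterBeforeDollar is a hypothetical name, and the sketch assumes Lua 5.1 or later for string.match. It only helps on lines where a "$" amount directly follows another number, and it treats surrounding spaces as padding, not as part of the delimiter.

```lua
-- Hypothetical helper: return the symbols found between a number and a "$".
function DelimiterBeforeDollar(sLine)
    -- A digit, optional spaces, a captured run of symbols, optional spaces, then "$"
    local sDelimiter = string.match(sLine, "%d%s*([^%w%s%$]+)%s*%$")
    -- Returns "" when no number/symbol/"$" sequence exists in the line
    return sDelimiter or ""
end
```

On the sample line "Credit card|$34.99 |$43.24 |$3.30 |$4.95" the first match is the "|" between "34.99 " and "$43.24".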
Stephen G.
11-01-2008, 01:14 PM
This works very well R, thank you. I still have to specify the delimiter, but I don't think there is any way around that. I added a few extra delimiters I have encountered so far.
-- Locate the data file
Data_file = Dialog.FileBrowse(true, "Locate File", _DesktopFolder, "All Files (*.txt)|*.txt|", "", "", false, false);
-- Read the whole file into a string
Data_file = TextFile.ReadToString(Data_file[1]);
function GetDelimiter(sDelimitedString, tbDelimiters)
    local sRet = "";
    -- Check the candidates in list order and stop at the first one found
    for index, sDelimiter in ipairs(tbDelimiters) do
        if String.ReverseFind(sDelimitedString, sDelimiter, false) ~= -1 then
            sRet = sDelimiter;
            break;
        end
    end
    return sRet;
end
sTab = String.Char(9);
-- Longer delimiters are listed first so that ";;" is not reported as ";"
sData = GetDelimiter(Data_file, {";;","::","||","---","--","|",",",";","/",sTab,":"});
-- convert to a comma delimited data file
output = String.Replace(Data_file, sData, ",", true);
Dialog.Message("test", output, MB_OK, MB_ICONINFORMATION, MB_DEFBUTTON1);
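One thing a plain replace does not cover is a field that already contains a comma (an address, for instance), which would split into two columns in the converted file. The sketch below shows one common way to handle that: split a line on the detected delimiter and wrap any field containing a comma or a double quote in quotes. ToCsvLine is a hypothetical name, the code works on a single line and not on a whole file, and it uses the standard Lua string library (assuming Lua 5.1 or later).

```lua
-- Hypothetical helper: convert one delimited line to a comma-separated line.
function ToCsvLine(sLine, sDelimiter)
    -- Nothing to split on, so return the line unchanged
    if sDelimiter == "" then
        return sLine
    end
    local tbFields = {}
    local nStart = 1
    while true do
        -- Plain-text search for the next delimiter
        local nFrom, nTo = string.find(sLine, sDelimiter, nStart, true)
        local sField = string.sub(sLine, nStart, (nFrom or #sLine + 1) - 1)
        -- Quote the field if it contains a comma or a double quote
        if string.find(sField, '[,"]') then
            sField = '"' .. (string.gsub(sField, '"', '""')) .. '"'
        end
        tbFields[#tbFields + 1] = sField
        if not nFrom then
            break
        end
        nStart = nTo + 1
    end
    return table.concat(tbFields, ",")
end
```

For example, ToCsvLine("John Doe;;123 Riverside Drive, Apt 4;;CA", ";;") keeps the address together as one quoted field.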
Stephen G.
11-01-2008, 01:20 PM
This is a reply to P. The entries do not all have the "$" symbol in front.
I believe R's code works well, so long as the delimiter is not uncommon.
Thanx