View Full Version : Can TextFile.ReadToTable be speeded up?



IdeasVacuum
10-09-2009, 06:32 PM
I need to read 6MB text files to tables (around 20,000 short lines of text). An update of the progress bar would be good but I do not understand why the read takes a month of Sundays in the first place - the files load instantly into plain text editors for example.

Has anyone hit this problem and found a solution?

Also, neither ReadToTable nor ReadToString handles Unix/Mac text files correctly: only the first line is read. Does anybody know a way around that gotcha programmatically?

jassing
10-09-2009, 11:33 PM
Can TextFile.ReadToTable be speeded up?

no.

Consider writing your routines in C and make a dll or exe to do the processing.

Remember, you're using a SCRIPTING language, it's not compiled.... so you can't expect pure speed out of it.

Sakuya
10-10-2009, 03:43 AM
Have you tried using the Lua Input/Output model?

This way you'll be able to give progress while reading line by line.

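Rough sketch, not tested in AMS.


local myfile = io.open("C:\\myfile.txt", "rb");
local contents = {};

if (myfile) then
    -- myfile:lines() returns nil at end of file, which ends the loop
    for line in myfile:lines() do
        -- "rb" leaves the CR of Windows line endings in place, so strip it
        contents[#contents + 1] = string.gsub(line, "\r$", "");
    end
    myfile:close();
end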
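
On the Unix/Mac question from the first post: Lua's line-based reads split on LF only, so a file with old Mac (CR-only) endings would come back as one long line. One possible workaround, sketched below under assumptions, is to read the whole file in binary mode and split it by hand on CRLF, LF, or CR. The names split_lines, read_lines, and on_progress are made up for this example and are not part of AMS or the Lua library; the callback is only a placeholder for whatever updates the progress bar.

```lua
-- Split a string into lines, accepting CRLF (Windows), LF (Unix)
-- and CR (old Mac) line endings.
-- split_lines and on_progress are hypothetical names for this sketch.
local function split_lines(data, on_progress)
    local lines = {}
    local pos, len = 1, #data
    while pos <= len do
        -- Find the next CR or LF at or after pos.
        local s = string.find(data, "[\r\n]", pos)
        if not s then
            -- No more line endings: the rest is the last line.
            lines[#lines + 1] = string.sub(data, pos)
            break
        end
        lines[#lines + 1] = string.sub(data, pos, s - 1)
        -- Treat CR immediately followed by LF as one line ending.
        if string.sub(data, s, s + 1) == "\r\n" then
            pos = s + 2
        else
            pos = s + 1
        end
        -- Report progress as a fraction (0..1) every 1000 lines.
        if on_progress and #lines % 1000 == 0 then
            on_progress(pos / len)
        end
    end
    return lines
end

-- Read a whole file in binary mode and split it into a table of lines.
-- Returns nil if the file cannot be opened.
local function read_lines(path, on_progress)
    local f = io.open(path, "rb")
    if not f then
        return nil
    end
    local data = f:read("*a")
    f:close()
    return split_lines(data, on_progress)
end
```

Calling read_lines("C:\\myfile.txt", some_callback) would return a table of lines, or nil if the file cannot be opened. Whether this is fast enough for a 6MB file inside AMS is untested; it holds the whole file in memory at once, which should be acceptable at that size.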

IdeasVacuum
10-10-2009, 12:55 PM
I have not seen that function before, thanks Sakuya.

IdeasVacuum
10-10-2009, 01:07 PM
Hello jassing

Of course I appreciate that scripting is much slower than compiled code, but in this case it seems to be slower than one might reasonably expect - perhaps vbs would win that race.

Anyway, I have bitten the bullet and coded the same task in PureBasic (Demo), which processes the entire job (6MB in, 6MB out) in less than 2 seconds (1 hour+ in AMS). That PB performance surprised me: it is much faster than a similar project I wrote in C++ (pure code, no .Net).