A coworker and I were working on a project where we needed to be able to parse a large (38,000-line) legacy SQL file. To make the parsing job easier, we wanted to break the monolithic file into smaller chunks of about 1,000 lines each. We thought very briefly about doing it by hand, but decided that automating it would be better. We thought about trying to do this with sed, but it looked like it would be complicated. We eventually settled on Ruby, and about an hour later, we had this:
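# Split a large SQL file into chunks of roughly 1,000 lines each,
# breaking only after a line that holds just GO or END.
SQL_FILE = "./GeneratedTestData.sql"
OUTPUT_PATH = "./chunks of sql/"
line_num = 1
file_num = 0
# File.exist? replaces File.exists?, which newer Ruby versions no longer provide.
Dir.mkdir(OUTPUT_PATH) unless File.exist? OUTPUT_PATH
file = File.new(OUTPUT_PATH + "chunk " + file_num.to_s + ".sql",
                File::CREAT|File::TRUNC|File::RDWR, 0644)
# Chained assignment sets both flags to false.
done = seen_1k_lines = false
IO.readlines(SQL_FILE).each do |line|
  file.puts(line)
  # Latch once the running line count reaches a multiple of 1,000.
  seen_1k_lines = (line_num % 1000 == 0) unless seen_1k_lines
  line_num += 1
  # A safe break point is a line containing only GO or END.
  done = (line.downcase =~ /^\W*go\W*$/ or
          line.downcase =~ /^\W*end\W*$/) != nil
  if done and seen_1k_lines
    file.close  # flush the finished chunk before starting the next one
    file_num += 1
    file = File.new(OUTPUT_PATH + "chunk " + file_num.to_s + ".sql",
                    File::CREAT|File::TRUNC|File::RDWR, 0644)
    done = seen_1k_lines = false
  end
end
file.close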
This little Ruby program copies lines from the original source file into the current chunk while keeping a running line count. Once that count reaches a multiple of 1,000, it starts looking for a line that contains only GO or END (ignoring case and any surrounding punctuation). When it finds one, it finishes off the current file and starts another one.
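The same splitting rule can be sketched as a small function that works on an array of lines in memory, which makes it easy to try on a tiny sample before pointing it at a 38,000-line file. This is only an illustration, not the program we actually ran: the names chunk_lines, BOUNDARY, and min_lines are made up for this sketch, and it counts lines per chunk rather than using one running total.

```ruby
# Hypothetical helper (not part of the original program): group lines into
# chunks of at least min_lines, ending each chunk on a GO or END line.
BOUNDARY = /^\W*(go|end)\W*$/i

def chunk_lines(lines, min_lines = 1000)
  chunks = [[]]
  lines.each do |line|
    chunks.last << line
    # Start a new chunk only when the current one is long enough and
    # the line just added contains nothing but GO or END.
    if chunks.last.size >= min_lines && line =~ BOUNDARY
      chunks << []
    end
  end
  chunks.pop if chunks.last.empty?  # drop a trailing empty chunk
  chunks
end

# Tiny sample with a threshold of 3 lines instead of 1,000.
sample = ["SELECT 1", "GO", "SELECT 2", "SELECT 3", "GO", "SELECT 4"]
chunks = chunk_lines(sample, 3)
chunks.each_with_index { |c, i| puts "chunk #{i}: #{c.size} lines" }
```

With the sample above, the first GO arrives too early to end a chunk, so the first chunk runs through the second GO and the remaining line lands in a second chunk. Writing each chunk to its own file is then a separate, simpler step.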
We calculated that it probably would have taken us about 10 minutes to break this file up via brute force, and it took about an hour to automate it. We eventually had to do it five more times, so we almost reclaimed the time we spent automating it. But that’s not the important point. Performing simple, repetitive tasks by hand makes you dumber, and it steals part of your concentration, which is your most productive asset.
Note:
Performing simple, repetitive tasks squanders your concentration.
Figuring out a clever way to automate the task makes you smarter because you learn something along the way. One of the reasons it took us so long to complete this Ruby program was our unfamiliarity with how Ruby handled low-level file manipulation. Now we know, and we can apply that knowledge to other projects. And, we’ve figured out how to automate part of our project infrastructure, making it more likely that we’ll find other ways to automate simple tasks.
Note:
Finding innovative solutions to problems makes it easier to solve similar problems in the future.