For example, consider the following HTML file:
With all the tags removed, the output would be:
Hello, there
I wrote this method with pen and paper. Yes, I was actually "WRITING" a program with a pen, not "TYPING" one in front of a computer. Here is the version I wrote:
Drawback
The program above reads the input file line by line, removes the tags from each line, and then appends the line to the output file. This implementation has a major drawback. Since the removeTag() method is applied line by line, it doesn't work if the open tag and the close tag are on separate lines, such as:
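One way this can happen is sketched below in Java. The class name, the stripLineByLine() helper, and the regex-based removeTag() are hypothetical stand-ins, not the handwritten version from this post. The sketch also assumes the failing case is a single tag whose `<` and `>` fall on different lines, which is one reading of the drawback.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LineByLineDemo {

    // Hypothetical stand-in for removeTag(): deletes every complete <...> found within one line.
    static String removeTag(String line) {
        return line.replaceAll("<[^>]*>", "");
    }

    // Applies removeTag() to each line independently, mirroring the line-by-line approach.
    static List<String> stripLineByLine(List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(removeTag(line));
        }
        return out;
    }

    public static void main(String[] args) {
        // The <a ...> tag starts on the first line and ends on the second.
        List<String> lines = Arrays.asList(
                "<a href=\"index.html\"",
                "   class=\"nav\">Hello, there</a>");
        for (String stripped : stripLineByLine(lines)) {
            System.out.println(stripped);
        }
    }
}
```

In this sketch the first line is printed unchanged, because it never contains a closing `>`, and the second line keeps the `class="nav">` fragment. Only the `</a>` tag, which sits entirely on one line, is removed.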
Alternative: Reading the whole file into a content buffer
When I finally arrived home and got access to a computer, I rewrote the method. This version consumes more memory, because the whole file is held in a buffer at once. However, it is a completely feasible approach. Most HTML files are not very big. You are unlikely to find an HTML file with a size of 100 megabytes.
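A minimal sketch of the whole-file idea in Java follows. It is not the rewritten method from this post: the class name, removeTags(), stripFile(), and the regex are assumptions chosen for illustration, and the input is assumed to be UTF-8.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class WholeFileDemo {

    // Deletes every <...> span. [^>] also matches newlines, so a tag may wrap across lines.
    static String removeTags(String content) {
        return content.replaceAll("<[^>]*>", "");
    }

    // Reads the entire input file into memory, strips the tags in one pass, and writes the result.
    static void stripFile(Path in, Path out) throws IOException {
        String content = new String(Files.readAllBytes(in), StandardCharsets.UTF_8);
        Files.write(out, removeTags(content).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // Usage: java WholeFileDemo input.html output.txt
            stripFile(Paths.get(args[0]), Paths.get(args[1]));
        } else {
            // Small in-memory demo: the <a ...> tag wraps onto a second line.
            String html = "<a href=\"index.html\"\n   class=\"nav\">Hello, there</a>";
            System.out.println(removeTags(html)); // prints "Hello, there"
        }
    }
}
```

Because the buffer contains the line breaks, a tag that wraps across lines is matched as a single span. The regex is deliberately simple: it would be confused by a `>` inside an attribute value, and it does not treat comments or script contents specially.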