Wednesday, December 05, 2007

wget

I'm sick of GUIs, that is, GROSS, UNUSABLE wannabe INTERFACEs. I'm going retro, 10 years back, and doing everything from the console. We lost so much power when we told people it worked better if it had a picture on it. Get your cross compiler running and load up the real tools, 10 to 30 years in the making. Try the following:

grep: Search for PATTERN in each FILE or standard input.
gawk: Interprets a special-purpose programming language that makes it possible to handle simple data-reformatting jobs with just a few lines of code.
sed: A stream editor, used to perform basic text transformations on an input stream.
wget: GNU Wget is a free utility for non-interactive download of files from the Web. It supports the HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.
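
To make the list a bit more concrete, here is a minimal sketch of the first three tools chained in one pipeline. The log lines, the field layout, and the variable names are all made up for illustration, and plain `awk` is used so it runs whether or not gawk is installed.

```shell
#!/bin/sh
# Hypothetical sample data: "date level message", latency in the last field.
log=$(mktemp)
cat > "$log" <<'EOF'
2007-12-01 INFO  request took 120ms
2007-12-02 ERROR request took 950ms
2007-12-03 INFO  request took 80ms
2007-12-04 ERROR request took 1050ms
EOF

# grep: keep only the ERROR lines
# sed:  strip the trailing "ms" so the last field is a bare number
# awk:  count the lines and average the last field
summary=$(grep 'ERROR' "$log" \
  | sed 's/ms$//' \
  | awk '{ sum += $NF; n++ } END { if (n) printf "%d errors, avg %d ms\n", n, sum / n }')

echo "$summary"
rm -f "$log"
```

Each stage does one small job and hands plain text to the next, which is most of the appeal.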

You'll be a much happier text-processing nerd. Well, OK, you can get some of them prebuilt here, and then read a 10-second how-to here. Toss the output of your script into gnuplot and you'll be able to do with your pinky and 10 lines of code what the new generation of engineers thought impossible. Now to figure out how to control a web page form from the console, and I'll be able to interact with the web from a script and be done! Maybe I need this form of Python.
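
A rough sketch of the script-output-into-gnuplot idea, assuming the script emits whitespace-separated columns. The sample numbers, file names, and plot titles are invented for illustration, and the gnuplot step is skipped if gnuplot is not installed.

```shell
#!/bin/sh
# Hypothetical input: one "x y" sample per line, as an earlier script might write it.
work=$(mktemp -d)
data="$work/samples.txt"
plotdata="$work/plot.dat"
printf '%s\n' '1 2' '2 4' '3 6' '4 8' > "$data"

# awk: append a third column holding the running total of column 2
awk '{ total += $2; print $1, $2, total }' "$data" > "$plotdata"

# gnuplot: only attempted when it is on the PATH; writes a PNG next to the data
if command -v gnuplot >/dev/null 2>&1; then
  gnuplot <<EOF
set terminal png
set output "$work/plot.png"
plot "$plotdata" using 1:2 with lines title "value", "$plotdata" using 1:3 with lines title "running total"
EOF
fi

cat "$plotdata"
```

The awk step is where the real work happens; gnuplot just reads the columns it is pointed at.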

Kev, I know you had some web scraping experience in the past. What tools did you use?

4 comments:

Anonymous said...

Wget is awesome. I love the CLI. ROCK ON!

forkev said...

I just checked. I have one small app I wrote with wget.exe at the core that is called by the Windows scheduler once a day. wget retrieves a buried webpage from one of my apps, and that page does the housecleaning of a website when called. A REALLY simple use for wget, but flawless for 2 years and counting.
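
For anyone curious, a setup like this might look roughly like the sketch below. The URL is a placeholder, the retry and timeout values are guesses, and it assumes the cleanup page is triggered by a plain GET. This is the Unix-shell flavor; with wget.exe on Windows the discard target would be `NUL` rather than `/dev/null`.

```shell
#!/bin/sh
# Hypothetical URL: substitute the page that triggers the cleanup.
url="http://example.com/admin/housekeeping"

# Build the command once so it can be previewed or executed.
# -q: quiet, -O /dev/null: throw away the page body,
# --tries / --timeout: give up quickly if the site is down.
cmd="wget -q -O /dev/null --tries=3 --timeout=30 $url"

if [ "${RUN:-0}" = "1" ]; then
  $cmd          # real run, e.g. from a scheduler with RUN=1 set
else
  echo "$cmd"   # default: only show what would be fetched
fi
```

By default the script only prints the command, so it is safe to try; setting `RUN=1` in the scheduled job makes it actually hit the page.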

k2h said...

I read about parsing Shuttle Radar Topography Mission (SRTM) data with awk, putting it in MySQL, and then exporting it with awk into KMZ for Google Earth integration. Genius! See techniques here.

Cefn Hoile used the information to find suitable launch sites for paragliding.

k2h said...

Best awk tutorial I've found yet.