Hi folks,
I'm doing some HTML page scraping (HTML page to CSV, as usual). So far, with a bunch of links, sed & grep pipes, I've got the beast into a fairly good state, but I'm stuck on the current step. The table data is nicely spaced out across the line. Now I want to replace each run of "padding" spaces with a single CSV delimiter, leaving the single "non-padding" spaces alone. By way of example, some lines of data in "infile.txt" might look like:
  7:30pm   McHales Navy        (Repeat)    Ensign Parker fouls up the supplies again.
  8:00pm   Green Acres         (Repeat)    Wilbur's plans for a new water supply cause havoc
  8:30pm   Movie: Rocky LXIV   (Premier)   Stalone breaks up a crime ring in a rest-home
In Kate, replacing " \s+" (a space followed by more whitespace) with "|" works ...
|7:30pm|McHales Navy|(Repeat)|Ensign Parker fouls up the supplies again.
|8:00pm|Green Acres|(Repeat)|Wilbur's plans for a new water supply cause havoc
|8:30pm|Movie: Rocky LXIV|(Premier)|Stalone breaks up a crime ring in a rest-home
But neither this (sed 's/\s+/|/g' infile.txt) nor any other regexp I can google or think up works in sed; it just outputs the same as the input.
Can anyone help me out here?
tia
ted
update: Never mind, and sorry folks, I found it: sed needs the -E option (extended regular expressions) for the + to work as a quantifier. Normal transmission will now be resumed .... (thanks anyway)
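
For anyone landing here later, this is a minimal sketch of what that can look like, not necessarily the exact command used above. It assumes the padding is always two or more plain spaces and that single spaces inside a field should be left alone. The function name pad_to_pipes and the padding widths in the sample line are made up for illustration.

```shell
# Hypothetical helper: collapse each run of two or more spaces into one "|".
pad_to_pipes() {
    # -E switches sed to extended regexps, so {2,} works as an interval
    # without backslashes. A single space never matches, so the spaces
    # inside "McHales Navy" are kept.
    sed -E 's/ {2,}/|/g' "$@"
}

# Sample line; the padding widths here are an assumption, not the real data.
printf '  7:30pm   McHales Navy   (Repeat)   Ensign Parker fouls up the supplies again.\n' | pad_to_pipes
```

Usage on a file would be: pad_to_pipes infile.txt > outfile.csv. Note that a plain \s+ with -E would also match the single spaces between words, which is why the sketch asks for at least two. With GNU sed and no -E, escaping the quantifier as \+ should behave the same way as + does under -E.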