                  BOOKMARKS database and internet robot

   Here is a set of classes, libraries and programs I use to manipulate my
bookmarks.html. I like Netscape Navigator, but I need more features, so I am
writing these programs for my own needs. In particular, I need to extend
Navigator's "What's new" feature (Navigator 4 calls it "Update bookmarks").

   These programs are intended to run as follows.
1. bkmk2db converts bookmarks.html to bookmarks.db.
2. chk_urls (Internet robot) runs against bookmarks.db, checks every URL and
   saves the results in check.db.
3. db2bkmk converts bookmarks.db back to bookmarks.html.
   Then I use this bookmarks file and...
4. bkmk2db converts bookmarks.html to bookmarks.db.
5. chk_urls (Internet robot) runs against bookmarks.db, checks every URL and
   saves the results in check.db (the old file is copied to check.old).
6. (A yet unnamed program) will compare check.old with check.db and generate
   a detailed report. For example:
      this URL is unchanged
      this URL is changed
      this URL is unavailable due to: host not found...

   The bookmarks database programs are almost debugged. What remains to be
done is support for aliases. The second version of the internet robot is
finished.

   Although not required, these programs work fine with tty_pbar.py (my little
module for creating text-mode progress bars).

COPYRIGHT and LEGAL ISSUES
   All programs are copyrighted by Oleg Broytmann and PhiloSoft Design. All
sources are protected by the GNU GPL. Programs are provided "as-is", without
any kind of warranty. All usual blah-blah-blah.

   #include

------------------------------ bkmk2db ------------------------------

   NAME
      bkmk2db.py - script to convert bookmarks.html to FLAD database.

   SYNOPSIS
      bkmk2db.py [-its] [/path/to/bookmarks.html]

   DESCRIPTION
      bkmk2db.py splits the given file (or ./bookmarks.html) into the FLAD
      database bookmarks.db in the current directory.

      Options:
      -i
         Inhibit the progress bar. The default is to display the progress bar
         if stderr.isatty()

      -t
         Convert to a text file (for debugging). The default is to convert to
         FLAD.
      -s
         Suppress the output of statistics at the end of the program. The
         default is to report how many lines the program read and how many
         URLs it parsed. Also suppresses some messages during the run.

   BUGS
      The program starts by writing lines to the header file until
      BookmarksParser initializes its own output file (this occurs when the
      parser encounters the first tag). This is a design flaw.

      Empty comments (a comment tag with no text after it) are not marked
      specially in the database, so db2bkmk.py will not reconstruct them.
      I don't need empty comments, so I consider this a feature, not a real
      bug.

      Aliases are not supported (yet).

------------------------------ db2bkmk ------------------------------

   NAME
      db2bkmk.py - script to reconstruct bookmarks.html back from FLAD
      database.

   SYNOPSIS
      db2bkmk.py [-is] [-t dict.db [-r]]

   DESCRIPTION
      db2bkmk.py reads bookmarks.db and creates two HTML files - public.html
      and private.html. The latter is just the full bookmarks.html, while the
      former hides the private folder.

      Options:
      -i
         Inhibit the progress bar. The default is to display the progress bar
         if stderr.isatty()

      -s
         Suppress the output of statistics at the end of the program. The
         default is to report how many records the program processed and how
         many URLs it created. Also suppresses some messages during the run.

      -t dict.db
         For most tasks, if someone needs to process bookmarks.db in a regular
         way (for example, replace all "gopher://gopher." with "http://www."),
         it is easy to write a special program that processes every DB record.
         For some tasks it is even simpler and faster to write sed/awk
         scripts. But there are cases when someone needs to process
         bookmarks.db in a non-regular way: one URL must be changed in one
         way, another URL in a second way, etc. The -t option lets you use an
         external dictionary for such translation. The dictionary itself is
         again a FLAD database, where every record has two keys - URL1 and
         URL2. With the -t option in effect, db2bkmk generates
         {private,public}.html, renames them to {private,public}.1, and then
         translates the entire bookmarks.db again, generating
         {private,public}.2 (four files in total), where every URL1 is
         replaced with URL2 from the dictionary. (See koi2win.db for an
         example of a translation dictionary.)

      -r
         Reverse the effect of the -t option - translate from URL2 to URL1.

   BUGS
      There are three hacks under the line marked with "Dirty hacks here":
      1. if record["Folder"] == "Private links":
         This is to hide passwords from my bookmarks file.

      2. if record["Folder"] == "All the rest - Unclassified":
            outfile.write(" "*level + "