X-Git-Url: https://git.phdru.name/?a=blobdiff_plain;f=doc%2FTODO;h=a74684b467cee497e87c307fb42f3bbd859812b5;hb=953ab6c88f08a4670304641966841a45739d8b7a;hp=6b4e748da00e3372adb33feee13066b42ebf24d3;hpb=9e35a705cfd7aa7640069ed805919a084ba2405c;p=bookmarks_db.git

diff --git a/doc/TODO b/doc/TODO
index 6b4e748..a74684b 100644
--- a/doc/TODO
+++ b/doc/TODO
@@ -1,4 +1,4 @@
- Cleanup HTML using BeautifulSoap or Tidy.
+ Cleanup HTML before parsing using BeautifulSoap or Tidy.
  Parse downloaded file and get javascript redirects.

  More and better documentation.
@@ -7,7 +7,7 @@
  New storage managers: shelve, SQL, ZODB, MetaKit.
  More robots (URL checkers): threading, asyncore-based.

- Configuration file for configuring defaults - global defaults for the system
+ Configuration file to configure defaults - global defaults for the system
  and local defaults for subsystems.

  Ruleset-based mechanisms to filter out what types of URLs to check: checking