git.phdru.name Git - bookmarks_db.git/history - Robots/parse_html_beautifulsoup.py
Fixed a bug - break out of the loop after finding the first working charset.
2008-02-13 Oleg Broytman: Replace BeautifulSoup's guessed cp1252 with DEFAULT_CHA...
2008-01-09 Oleg Broytman: Do the second check for title only if there is HEAD.
2008-01-08 Oleg Broytman: Some sites put TITLE in HTML outside of HEAD.
2008-01-08 Oleg Broytman: Some sites put TITLE in HTML without HEAD.
2008-01-08 Oleg Broytman: Do not return an empty string - pass it to BSoupParser.
2008-01-08 Oleg Broytman: If there is HEAD but no TITLE - return empty title.
2007-12-22 Oleg Broytman: Always close the input file.
2007-12-22 Oleg Broytman: Added BadDeclParser.
2007-12-18 Oleg Broytman: Try BeautifulSoup; if it fails - fall back to HTML...
2007-12-18 Oleg Broytman: Fixed a bug = meta_charset is True if HTTP charset...
2007-12-16 Oleg Broytman: Direct return.
2007-12-16 Oleg Broytman: Calculate if the charset came from HTTP or from HTML...
2007-12-16 Oleg Broytman: Inherit HTMLParser (for unescape).
2007-12-16 Oleg Broytman: Added parser for html based on BeautifulSoup.