git.phdru.name Git - bookmarks_db.git/log
Oleg Broytman [Mon, 4 Mar 2024 15:13:13 +0000 (18:13 +0300)]
Fix(Robots/bkmk_rrequests): Add forgotten spaces in log
Oleg Broytman [Mon, 4 Mar 2024 10:48:26 +0000 (13:48 +0300)]
Fix(Robots/bkmk_rrequests): No need to re-check error 404 via proxy
Oleg Broytman [Sun, 3 Mar 2024 20:49:55 +0000 (23:49 +0300)]
Version 5.2.3
Feat(Robots/bkmk_rrequests): Report 40x and 50x errors.
Fix HTML parser based on Bs4: Find "shortcut icon".
Oleg Broytman [Sun, 3 Mar 2024 20:41:54 +0000 (23:41 +0300)]
Feat(Robots/bkmk_rrequests): Report 40x and 50x errors
Oleg Broytman [Sun, 3 Mar 2024 20:31:44 +0000 (23:31 +0300)]
Feat(Robots/bkmk_rrequests): Change error message
Oleg Broytman [Sun, 3 Mar 2024 14:47:58 +0000 (17:47 +0300)]
Fix(parse_html/bkmk_ph_beautifulsoup4): Find "shortcut icon"
Bs4 splits attribute values. To fix this, the value must be recombined.
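A minimal sketch of the recombination, not the project's actual code. Beautiful Soup returns multi-valued attributes such as `rel` as a list (for example `["shortcut", "icon"]`), so the value has to be joined before comparing it with "shortcut icon". The helper names `recombine_attr` and `is_shortcut_icon` are hypothetical.

```python
def recombine_attr(value):
    """Join a split attribute value back into one space-separated string."""
    if isinstance(value, (list, tuple)):
        return " ".join(value)
    # Single-valued attributes arrive as a plain string; a missing one as None.
    return value or ""


def is_shortcut_icon(rel_value):
    """Return True if a <link> rel value, split or not, means "shortcut icon"."""
    return recombine_attr(rel_value).lower() == "shortcut icon"
```

With Beautiful Soup the argument would typically be something like `link.get("rel")`; the sketch itself avoids importing bs4 so it stays self-contained.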
Oleg Broytman [Sun, 3 Mar 2024 14:24:52 +0000 (17:24 +0300)]
Fix(Robots/bkmk_robot_base): Add forgotten spaces in log
Oleg Broytman [Sun, 3 Mar 2024 10:29:06 +0000 (13:29 +0300)]
Version 5.2.2
Robots/bkmk_rrequests: Add request headers.
Robots/bkmk_robot_base: Process "data:image/" icons.
Oleg Broytman [Sun, 3 Mar 2024 10:22:48 +0000 (13:22 +0300)]
Feat(Robots/bkmk_robot_base): Process "data:image/" icons
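The commit does not say how such icons are processed. As an illustration only, a `data:image/` URL can be decoded with the standard library alone. The function name `decode_data_icon` and its return shape are assumptions, not the robot's API.

```python
import base64
from urllib.parse import unquote_to_bytes


def decode_data_icon(url):
    """Return (content_type, raw_bytes) for a data:image/ URL, else None."""
    if not url.startswith("data:image/"):
        return None
    header, sep, payload = url[len("data:"):].partition(",")
    if not sep:
        return None  # malformed: no comma between header and payload
    parts = header.split(";")
    content_type = parts[0]
    if "base64" in parts[1:]:
        data = base64.b64decode(payload)
    else:
        # Without ";base64" the payload is percent-encoded.
        data = unquote_to_bytes(payload)
    return content_type, data
```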
Oleg Broytman [Sun, 3 Mar 2024 10:10:13 +0000 (13:10 +0300)]
Feat(Robots/bkmk_rrequests): Add request headers
Oleg Broytman [Sun, 3 Mar 2024 09:48:11 +0000 (12:48 +0300)]
Refactor(Robots): Refactor request headers
Oleg Broytman [Sun, 3 Mar 2024 09:47:40 +0000 (12:47 +0300)]
Style(Robots/bkmk_rurllib_py3): Remove unused variable
Oleg Broytman [Sat, 2 Mar 2024 13:28:46 +0000 (16:28 +0300)]
Fix(Robots/bkmk_robot_base): Ignore unknown charset
Some sites provide an incorrect
(most probably misspelled) charset.
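One way to ignore an unknown charset, sketched under assumptions: check the name with `codecs.lookup()` and fall back to a default when Python does not recognize it. The function name and the UTF-8 fallback are hypothetical. The commit does not state what the robot falls back to.

```python
import codecs


def decode_with_fallback(data, charset, fallback="utf-8"):
    """Decode bytes with the declared charset, ignoring unknown charset names."""
    try:
        codecs.lookup(charset)
    except (LookupError, TypeError):
        # LookupError: unknown (e.g. misspelled) name; TypeError: charset is None.
        charset = fallback
    return data.decode(charset, errors="replace")
```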
Oleg Broytman [Sat, 2 Mar 2024 09:28:34 +0000 (12:28 +0300)]
Fix(Robots/bkmk_robot_base): Add forgotten space in log
Oleg Broytman [Sat, 2 Mar 2024 09:13:42 +0000 (12:13 +0300)]
Perf(Robots/requests): Speedup second access
Use the proxy immediately for hosts
that are already known to require it.
Don't use the proxy for hosts that aren't accessible even through it;
immediately return an error.
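The idea can be sketched as two per-host sets that are consulted before each request. This is an illustration, not the robot's code. The class name `ProxyMemo`, its method names and the returned strings are all hypothetical.

```python
class ProxyMemo:
    """Remember, per host, what earlier proxy attempts revealed."""

    def __init__(self):
        self.needs_proxy = set()   # hosts that worked only through the proxy
        self.unreachable = set()   # hosts that failed even through the proxy

    def plan(self, host):
        """Decide how to access a host: "direct", "proxy" or "error"."""
        if host in self.unreachable:
            return "error"
        if host in self.needs_proxy:
            return "proxy"
        return "direct"

    def record_proxy_result(self, host, ok):
        """Store the outcome of a proxy attempt made after direct access failed."""
        if ok:
            self.needs_proxy.add(host)
        else:
            self.unreachable.add(host)
```

On the second access to a host, `plan()` skips the failing direct attempt or skips the request entirely, which is where the speedup would come from.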
Oleg Broytman [Fri, 1 Mar 2024 21:02:57 +0000 (00:02 +0300)]
Refactor(Robots/requests)
Oleg Broytman [Fri, 1 Mar 2024 20:57:56 +0000 (23:57 +0300)]
Feat: Allow the robot based on requests to use a proxy
Oleg Broytman [Wed, 28 Feb 2024 21:18:38 +0000 (00:18 +0300)]
Feat: Robot based on requests
Oleg Broytman [Wed, 28 Feb 2024 19:03:42 +0000 (22:03 +0300)]
Feat(venv): Use `venv` if `virtualenv` is not available
Oleg Broytman [Tue, 28 Nov 2023 17:04:18 +0000 (20:04 +0300)]
Fix(Py3): Use `urllib.parse.urlsplit()`
Oleg Broytman [Wed, 22 Nov 2023 16:09:56 +0000 (19:09 +0300)]
Release 5.0.0
Oleg Broytman [Wed, 22 Nov 2023 16:09:45 +0000 (19:09 +0300)]
Docs: Update
Oleg Broytman [Tue, 21 Nov 2023 18:47:34 +0000 (21:47 +0300)]
Fix(Py3): Open list of titles in UTF-8
Oleg Broytman [Tue, 21 Nov 2023 18:46:42 +0000 (21:46 +0300)]
Fix(Py3): Always open text storage files in UTF-8
Oleg Broytman [Mon, 20 Nov 2023 20:58:14 +0000 (23:58 +0300)]
Fix(Py3): Always log in UTF-8
Oleg Broytman [Mon, 20 Nov 2023 17:49:22 +0000 (20:49 +0300)]
Fix(Py3): `html.parser` cannot parse bytes
Decode to unicode from a known encoding.
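A small illustration of the point: in Python 3 `HTMLParser.feed()` expects `str`, so bytes have to be decoded first. The `TitleParser` class and `parse_title` function are hypothetical examples, not the project's parser.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text inside <title>...</title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def parse_title(raw, encoding):
    """Decode bytes from a known encoding, then feed the text to the parser."""
    parser = TitleParser()
    parser.feed(raw.decode(encoding))
    parser.close()
    return parser.title
```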
Oleg Broytman [Mon, 20 Nov 2023 17:34:36 +0000 (20:34 +0300)]
Fix(Py3): Decode content using HTTP charset
Oleg Broytman [Mon, 20 Nov 2023 17:33:42 +0000 (20:33 +0300)]
Fix(Py3): `urllib` writes its files as bytes
Oleg Broytman [Mon, 20 Nov 2023 16:21:22 +0000 (19:21 +0300)]
Fix(parse_html CLI): Report encodings and the title
Oleg Broytman [Mon, 20 Nov 2023 16:20:31 +0000 (19:20 +0300)]
Fix(parse_html/bkmk_parse_html.py): Open the file with known encoding
Oleg Broytman [Mon, 20 Nov 2023 01:12:54 +0000 (04:12 +0300)]
Fix(parse_html/bkmk_ph_beautifulsoup4): Fix title recombination
Oleg Broytman [Mon, 20 Nov 2023 01:02:30 +0000 (04:02 +0300)]
Fix(Py3): Remove forgotten `.decode()`/`.encode()`
Oleg Broytman [Mon, 20 Nov 2023 00:50:26 +0000 (03:50 +0300)]
Feat: Remove some HTML parsers
EtreeTidy is outdated and buggy.
html5 is outdated.
Oleg Broytman [Mon, 20 Nov 2023 00:39:46 +0000 (03:39 +0300)]
Style: Fix `flake8` E501 line too long
Oleg Broytman [Mon, 20 Nov 2023 00:38:00 +0000 (03:38 +0300)]
Style: Fix `flake8` E402 module level import not at top of file
Oleg Broytman [Mon, 20 Nov 2023 00:16:46 +0000 (03:16 +0300)]
Chore(venv): Only run `pip install` on fresh virtual env
Oleg Broytman [Mon, 20 Nov 2023 00:00:06 +0000 (03:00 +0300)]
Feat(check_url.py): Print "Moved", "Size", "Md5"
Oleg Broytman [Sun, 19 Nov 2023 23:58:53 +0000 (02:58 +0300)]
Fix(robots): Fix "Content-Length" header returning `None`
Oleg Broytman [Sat, 18 Nov 2023 16:47:22 +0000 (19:47 +0300)]
Fix(robots): Store charset
Oleg Broytman [Fri, 17 Nov 2023 23:55:10 +0000 (02:55 +0300)]
Fix(robots): Do not parse empty strings
Some sites return empty "html" that consists only of whitespace.
Strip it to get a truly empty string.
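A sketch of the guard, assuming the parser is passed in as a callable. The name `parse_if_not_empty` and the `None` return value are hypothetical.

```python
def parse_if_not_empty(html, parse):
    """Strip whitespace and call the parser only if something is left."""
    html = html.strip()
    if not html:
        return None  # whitespace-only "html": nothing to parse
    return parse(html)
```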
Oleg Broytman [Fri, 17 Nov 2023 23:54:46 +0000 (02:54 +0300)]
Fix(parse_html): Do not parse empty strings
Oleg Broytman [Fri, 17 Nov 2023 22:32:53 +0000 (01:32 +0300)]
Fix(Py3): Reconfigure logs to write in UTF-8
Oleg Broytman [Fri, 17 Nov 2023 21:48:40 +0000 (00:48 +0300)]
Build(Makefile): Update the list of example shell scripts
Oleg Broytman [Thu, 16 Nov 2023 07:27:26 +0000 (10:27 +0300)]
Feat: Delete bookmarks
Oleg Broytman [Thu, 16 Nov 2023 05:35:41 +0000 (08:35 +0300)]
Feat(robots): Align "Content-Type"
Oleg Broytman [Thu, 16 Nov 2023 05:33:45 +0000 (08:33 +0300)]
Fix(parse_html): Do not parse empty strings
Oleg Broytman [Thu, 16 Nov 2023 05:26:52 +0000 (08:26 +0300)]
Fix(Py3): Fix `unescape`
Oleg Broytman [Wed, 15 Nov 2023 21:28:08 +0000 (00:28 +0300)]
Fix(Py3): Fix `check_url.py`
Oleg Broytman [Wed, 15 Nov 2023 18:12:15 +0000 (21:12 +0300)]
Build: Make Python virtual environment
Install libraries.
Oleg Broytman [Wed, 15 Nov 2023 16:58:36 +0000 (19:58 +0300)]
Fix(Py3): Fix HTML parsers
Oleg Broytman [Tue, 14 Nov 2023 23:27:46 +0000 (02:27 +0300)]
Feat(robots): Handle HTTP redirect 308
Oleg Broytman [Tue, 14 Nov 2023 18:01:59 +0000 (21:01 +0300)]
Feat: Improve stats
Oleg Broytman [Tue, 14 Nov 2023 17:56:26 +0000 (20:56 +0300)]
Feat: Open log files in UTF-8 encoding
Oleg Broytman [Tue, 14 Nov 2023 17:53:50 +0000 (20:53 +0300)]
Feat: Log reports to files
Oleg Broytman [Tue, 14 Nov 2023 16:56:41 +0000 (19:56 +0300)]
Docs(TODO): Increase priority for robots
Oleg Broytman [Tue, 14 Nov 2023 15:11:12 +0000 (18:11 +0300)]
Feat: Report redirects and set URLs
Run through the bookmarks database and set URLs from redirects
from an external file.
Oleg Broytman [Mon, 13 Nov 2023 22:21:45 +0000 (01:21 +0300)]
Fix(Py3): Catch `http.client.IncompleteRead`
Oleg Broytman [Mon, 13 Nov 2023 15:13:14 +0000 (18:13 +0300)]
Fix(Py3): Guess input file encoding
Oleg Broytman [Mon, 13 Nov 2023 14:39:17 +0000 (17:39 +0300)]
Chore: Explicitly open text files in text mode
Oleg Broytman [Mon, 13 Nov 2023 14:36:01 +0000 (17:36 +0300)]
Fix(Py3): Open output text files in utf-8 encoding
Oleg Broytman [Sun, 12 Nov 2023 19:10:20 +0000 (22:10 +0300)]
Docs: Update
Oleg Broytman [Sun, 12 Nov 2023 18:56:15 +0000 (21:56 +0300)]
Fix(robots): Process redirect with non-encoded URL
Oleg Broytman [Sun, 12 Nov 2023 18:19:58 +0000 (21:19 +0300)]
Fix(robots): Process response without `Content-Type`
Try to recognize HTML.
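The commit does not describe the recognition method. One possible heuristic, shown purely as an assumption, is to look at the beginning of the body for typical HTML markers. The function name `looks_like_html` and the 1024-byte probe size are hypothetical.

```python
def looks_like_html(body, probe=1024):
    """Guess whether a response body is HTML by inspecting its first bytes."""
    head = body[:probe].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return True
    # Some pages omit the doctype and <html>; accept early <head> or <title>.
    return b"<head" in head or b"<title" in head
```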
Oleg Broytman [Sun, 12 Nov 2023 18:11:22 +0000 (21:11 +0300)]
Fix(Py3): Fix log reporting
`error` could be bytes.
Oleg Broytman [Sun, 12 Nov 2023 16:11:20 +0000 (19:11 +0300)]
Fix(Py3): Fix subprocess: pass bytes streams to `RecordFile`
Oleg Broytman [Sun, 12 Nov 2023 14:36:14 +0000 (17:36 +0300)]
Fix(Py3): Subprocess must use `urllib`
`urllib2` robot doesn't work in Python 3.
Oleg Broytman [Sun, 12 Nov 2023 14:23:26 +0000 (17:23 +0300)]
Fix(Py3): Fix `subproc.py`
Work with bytes.
Oleg Broytman [Sun, 12 Nov 2023 13:57:51 +0000 (16:57 +0300)]
Fix(Py3): Fix absolute import
Oleg Broytman [Sun, 12 Nov 2023 13:49:28 +0000 (16:49 +0300)]
Fix(Py3): Some socket errors are reported as `OSError`
Oleg Broytman [Sun, 12 Nov 2023 13:35:38 +0000 (16:35 +0300)]
Fix(Py3): Encode unicode to bytes
Oleg Broytman [Sun, 12 Nov 2023 11:46:38 +0000 (14:46 +0300)]
Fix(Py3): Work around an old bug in `urlopen`
It passes an extra parameter `timeout`
which `URLopener.open()` doesn't accept.
Oleg Broytman [Sun, 12 Nov 2023 11:24:49 +0000 (14:24 +0300)]
Fix(Robots/bkmk_rurllib_py3.py): Restore opener
`urllib.request.urlcleanup()` clears opener.
Oleg Broytman [Sun, 12 Nov 2023 11:24:18 +0000 (14:24 +0300)]
Fix(Storage/bkmk_stflad.py): Fix reading header
Oleg Broytman [Sun, 12 Nov 2023 10:38:34 +0000 (13:38 +0300)]
Build(Makefile): The next version will be a new major release
Oleg Broytman [Sun, 12 Nov 2023 10:38:08 +0000 (13:38 +0300)]
Docs(README): Fix copyright year
Oleg Broytman [Sun, 12 Nov 2023 10:04:22 +0000 (13:04 +0300)]
Fix(Py3): Fix `list.join(separator)`
It's now `separator.join(list)`.
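For illustration, the Python 3 form calls `join` on the separator string. Python 2 code often used `string.join(words, sep)` instead. The helper `join_fields` is a hypothetical example.

```python
def join_fields(fields, separator=", "):
    """Join items with the separator; str.join() accepts only strings."""
    return separator.join(str(field) for field in fields)
```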
Oleg Broytman [Sun, 12 Nov 2023 10:01:29 +0000 (13:01 +0300)]
Fix(Py3): Fix `urllib`-based robot
Oleg Broytman [Sat, 11 Nov 2023 20:10:39 +0000 (23:10 +0300)]
Fix(Py3): Fix `htmlentities` import
Oleg Broytman [Sat, 11 Nov 2023 20:08:17 +0000 (23:08 +0300)]
Fix(Py3): Fix `urljoin` import
Oleg Broytman [Sat, 11 Nov 2023 18:35:26 +0000 (21:35 +0300)]
Fix(Py3): Stop encoding unicode to bytes
Oleg Broytman [Sat, 11 Nov 2023 18:34:51 +0000 (21:34 +0300)]
Fix(Py3): Stop using module `string`
Oleg Broytman [Sat, 11 Nov 2023 18:33:45 +0000 (21:33 +0300)]
Fix(Py3): Open files in text mode
Oleg Broytman [Fri, 10 Nov 2023 14:07:27 +0000 (17:07 +0300)]
Fix(Py3): Fix `.has_key()`
Oleg Broytman [Wed, 1 Nov 2023 15:32:01 +0000 (18:32 +0300)]
Fix(Py3): Fix import from `urllib`
Oleg Broytman [Tue, 31 Oct 2023 19:03:27 +0000 (22:03 +0300)]
Fix(Storage/bkmk_stjson.py): open file in text mode
Oleg Broytman [Tue, 31 Oct 2023 19:03:00 +0000 (22:03 +0300)]
Fix(Py3): `exec` in a local namespace
Oleg Broytman [Tue, 31 Oct 2023 16:45:06 +0000 (19:45 +0300)]
Feat: Set shebang to `python3`
Oleg Broytman [Tue, 31 Oct 2023 16:31:11 +0000 (19:31 +0300)]
Fix(parse_html): Fix import
Oleg Broytman [Thu, 28 Sep 2023 12:34:40 +0000 (15:34 +0300)]
Style: Fix flake8 W605 invalid escape sequence
Oleg Broytman [Thu, 28 Sep 2023 12:21:50 +0000 (15:21 +0300)]
Style: Fix flake8 F841 local variable is assigned to but never used
Oleg Broytman [Sat, 16 Sep 2023 19:40:15 +0000 (22:40 +0300)]
Fix(Py3): Fix `cmp` compatibility
Oleg Broytman [Sat, 16 Sep 2023 19:39:13 +0000 (22:39 +0300)]
Style: Fix flake8 F821 undefined name 'unichr'
Oleg Broytman [Sat, 16 Sep 2023 19:26:28 +0000 (22:26 +0300)]
Fix(Py3): Fix `basestring` compatibility
Oleg Broytman [Sat, 16 Sep 2023 19:23:09 +0000 (22:23 +0300)]
Fix(parse_html/bkmk_parse_html): Fix imports
Oleg Broytman [Sat, 16 Sep 2023 19:20:51 +0000 (22:20 +0300)]
Fix(Py3): Fix `unicode` compatibility
Oleg Broytman [Mon, 11 Sep 2023 14:06:44 +0000 (17:06 +0300)]
Fix(Py3): Replace `unicode()` with `.decode()`
Oleg Broytman [Mon, 11 Sep 2023 13:52:04 +0000 (16:52 +0300)]
Style: Fix flake8 F401 module imported but unused
Oleg Broytman [Mon, 11 Sep 2023 13:49:25 +0000 (16:49 +0300)]
Style: Fix flake8 E999 IndentationError: unexpected indent
Oleg Broytman [Wed, 6 Sep 2023 20:45:07 +0000 (23:45 +0300)]
Style: Fix flake8 E741 ambiguous variable name 'l'
Oleg Broytman [Wed, 6 Sep 2023 20:45:27 +0000 (23:45 +0300)]
Style: Completely ignore some flake8 warnings
E701 multiple statements on one line (colon)
E722 do not use bare 'except'