git.phdru.name Git - bookmarks_db.git/log
Oleg Broytman [Sun, 8 Sep 2024 07:36:12 +0000 (10:36 +0300)]
Feat(Robots): Combined curl with curlmulti
The combined robot is named just curl.
Oleg Broytman [Sat, 7 Sep 2024 14:28:54 +0000 (17:28 +0300)]
Feat(Robots): Robot based on curl_multi
Processes multiple URLs in parallel using concurrent.futures (multithreaded).
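A minimal sketch of checking several URLs in parallel with `concurrent.futures` threads. It is not the project's code: `check_all` and `check_one` are hypothetical names, `max_urls` is borrowed from a later rename in this log, and the per-URL check is passed in as a plain callable so the example runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor


def check_all(urls, check_one, max_urls=10):
    """Run check_one(url) for every URL in a thread pool.

    check_one is a hypothetical callable standing in for the real
    per-URL check; max_urls caps the number of worker threads.
    Returns a dict mapping each URL to its result.
    """
    urls = list(urls)  # the sequence is iterated twice below
    with ThreadPoolExecutor(max_workers=max_urls) as executor:
        # executor.map() yields results in the order of the inputs.
        return dict(zip(urls, executor.map(check_one, urls)))
```

For example, `check_all(["a", "bb"], len, max_urls=2)` pairs every input with the value returned by the callable.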
Oleg Broytman [Sat, 7 Sep 2024 11:39:21 +0000 (14:39 +0300)]
Feat(aio): Combine `aiohttp` with `multiaio`
The combined robot is named just `aio`.
Oleg Broytman [Thu, 5 Sep 2024 14:32:55 +0000 (17:32 +0300)]
Refactor(Robots): Split `bkmk_rmultirequests` into `concurrent_futures` mix-in
Oleg Broytman [Thu, 5 Sep 2024 14:03:04 +0000 (17:03 +0300)]
Feat(bkmk_rmultirequests): Fix `concurrent_class` to `ProcessPoolExecutor`
Oleg Broytman [Wed, 4 Sep 2024 18:03:31 +0000 (21:03 +0300)]
Update docs
Oleg Broytman [Thu, 22 Aug 2024 22:22:39 +0000 (01:22 +0300)]
Fix(base): Fix reporting proxy error
Oleg Broytman [Thu, 22 Aug 2024 21:46:45 +0000 (00:46 +0300)]
Feat(Robots): Simplify `.get()`
Return data without processing;
let `get_url()` process it in one place.
Oleg Broytman [Thu, 22 Aug 2024 16:54:36 +0000 (19:54 +0300)]
Refactor(bkmk_rcurl): Split off `CurlWrapper`
Will be used in multi-curl robot(s).
Oleg Broytman [Wed, 21 Aug 2024 14:32:25 +0000 (17:32 +0300)]
Refactor(Robots): Separate `headers` into `req_headers` and `resp_headers`
Oleg Broytman [Wed, 21 Aug 2024 14:04:46 +0000 (17:04 +0300)]
Refactor(bkmk_rmultiaio): Split off `multi_async_mixin`
Oleg Broytman [Tue, 20 Aug 2024 22:30:28 +0000 (01:30 +0300)]
Feat(bkmk_rcurl): Lower SSL security settings
Oleg Broytman [Tue, 20 Aug 2024 22:21:26 +0000 (01:21 +0300)]
Refactor(Robots): Pass headers instead of charset
Oleg Broytman [Tue, 20 Aug 2024 17:28:15 +0000 (20:28 +0300)]
Chore(Robots): Report "Checked: <URL>" but avoid duplicates
Robots that process multiple URLs in parallel report it themselves.
Oleg Broytman [Tue, 20 Aug 2024 17:27:08 +0000 (20:27 +0300)]
Chore(check_urls): Improve output
Oleg Broytman [Mon, 19 Aug 2024 19:28:41 +0000 (22:28 +0300)]
Refactor(Robots): `get`/`get_url` don't need `bookmark`, only charset
Oleg Broytman [Mon, 19 Aug 2024 16:26:29 +0000 (19:26 +0300)]
Fix(get_url): Adapt `get_url` to the new shiny async world
Oleg Broytman [Mon, 19 Aug 2024 14:51:20 +0000 (17:51 +0300)]
Refactor(bkmk_rmultirequests): Change parent to `robot_base`
`bkmk_rmultirequests` instantiates `bkmk_rrequests` in the workers
but itself doesn't really use anything from `bkmk_rrequests`,
so it can be just a base robot.
Oleg Broytman [Mon, 19 Aug 2024 13:30:59 +0000 (16:30 +0300)]
Docs: Fix forgotten docs
Oleg Broytman [Mon, 19 Aug 2024 12:43:54 +0000 (15:43 +0300)]
Feat(bkmk_raiohttp): Lower SSL cert validation strictness
Oleg Broytman [Sat, 7 Sep 2024 10:58:16 +0000 (13:58 +0300)]
Fix(robots.py): Fix robot name
Oleg Broytman [Mon, 19 Aug 2024 00:59:24 +0000 (03:59 +0300)]
Feat(Robots): Robot based on aiohttp, processes multiple URLs in parallel
Oleg Broytman [Sun, 18 Aug 2024 20:52:21 +0000 (23:52 +0300)]
Feat(Robots): Make all robots async
Split check_bookmark() into sync and async variants.
Oleg Broytman [Sun, 18 Aug 2024 20:38:52 +0000 (23:38 +0300)]
Refactor(Robots): Split off `multi_mixin`
Oleg Broytman [Sun, 18 Aug 2024 20:28:56 +0000 (23:28 +0300)]
Style(bkmk_rmultirequests): Renamed max_workers to max_urls
Oleg Broytman [Sun, 18 Aug 2024 19:57:57 +0000 (22:57 +0300)]
Refactor(Robots/base): Simplify `X-User-Agent`
Oleg Broytman [Sun, 18 Aug 2024 19:42:02 +0000 (22:42 +0300)]
Refactor(Robots): Rename `bkmk_robot_base.py` -> `base.py`
Oleg Broytman [Fri, 16 Aug 2024 16:40:28 +0000 (19:40 +0300)]
Build: Use Python 3
Oleg Broytman [Fri, 16 Aug 2024 16:39:04 +0000 (19:39 +0300)]
Version 5.7.0
Oleg Broytman [Fri, 16 Aug 2024 14:05:48 +0000 (17:05 +0300)]
Fix(bkmk_robot_base): Redraw progress bar after unhandled exception
Oleg Broytman [Fri, 16 Aug 2024 13:53:31 +0000 (16:53 +0300)]
Fix(bkmk_rmultirequests): Limit number of URLs to load into workers
Oleg Broytman [Fri, 16 Aug 2024 13:37:42 +0000 (16:37 +0300)]
Feat(bkmk_robot_base): Send our version in `X-User-Agent` header
Oleg Broytman [Fri, 16 Aug 2024 13:36:55 +0000 (16:36 +0300)]
Refactor: Move version from `setup.py` to `bkmk_objects.py`
Oleg Broytman [Fri, 16 Aug 2024 13:36:09 +0000 (16:36 +0300)]
Build: Install `setuptools` for `setup.py`
Oleg Broytman [Fri, 16 Aug 2024 13:21:51 +0000 (16:21 +0300)]
Feat(bkmk_raiohttp): Use siosocks for aioftp
Oleg Broytman [Fri, 16 Aug 2024 13:15:06 +0000 (16:15 +0300)]
Feat(bkmk_raiohttp): Use aiohttp-socks
Oleg Broytman [Fri, 16 Aug 2024 12:46:50 +0000 (15:46 +0300)]
Feat(Robots): Removed connect_timeout, added ftp_timeout
Oleg Broytman [Thu, 15 Aug 2024 22:59:22 +0000 (01:59 +0300)]
Fix(bkmk_raiohttp): Don't list FTP recursively
Oleg Broytman [Thu, 15 Aug 2024 22:36:07 +0000 (01:36 +0300)]
Feat(bkmk_rrequests): Use ftplib directly, without requests_ftp
Oleg Broytman [Thu, 15 Aug 2024 21:59:06 +0000 (00:59 +0300)]
Refactor(bkmk_raiohttp): Remove unused values
Oleg Broytman [Thu, 15 Aug 2024 20:05:29 +0000 (23:05 +0300)]
Refactor(bkmk_robot_base):
Oleg Broytman [Thu, 15 Aug 2024 20:04:46 +0000 (23:04 +0300)]
Chore(bkmk_parser): Fix year
Oleg Broytman [Thu, 15 Aug 2024 17:52:45 +0000 (20:52 +0300)]
Version 5.6.1: Minor fixes
Oleg Broytman [Sun, 11 Aug 2024 18:24:52 +0000 (21:24 +0300)]
Fix(bkmk_ph_lxml): Catch `ParserError`
Oleg Broytman [Sun, 11 Aug 2024 18:12:43 +0000 (21:12 +0300)]
Fix(bkmk_robot_base): Decode base64 bytes to unicode
Oleg Broytman [Thu, 8 Aug 2024 17:52:48 +0000 (20:52 +0300)]
Fix(bkmk_rrequests): Not all error codes have messages
Oleg Broytman [Thu, 8 Aug 2024 13:25:33 +0000 (16:25 +0300)]
Feat(Robots): Robot based on requests and concurrent.futures
Processes multiple URLs in parallel.
Oleg Broytman [Thu, 8 Aug 2024 04:45:58 +0000 (07:45 +0300)]
Feat: Dropped support for Python 2
Oleg Broytman [Thu, 8 Aug 2024 04:31:23 +0000 (07:31 +0300)]
Feat(Robots): Remove urllib-based robots
Oleg Broytman [Wed, 7 Aug 2024 22:17:09 +0000 (01:17 +0300)]
Style(Writers/bkmk_wflad): Rename loop variables
Oleg Broytman [Wed, 7 Aug 2024 22:14:28 +0000 (01:14 +0300)]
Refactor: Extract the common list of attributes; `copy_bkmk()`
Oleg Broytman [Wed, 7 Aug 2024 21:51:21 +0000 (00:51 +0300)]
Feat(check_urls): Separately report redirects
Oleg Broytman [Wed, 7 Aug 2024 17:00:53 +0000 (20:00 +0300)]
Feat(Robots): Stop the robot ASAP
Process bookmarks after the robot has stopped. In the future
there will be robots that check multiple URLs in parallel,
so bookmarks cannot be processed inside the check loop
but can be queried afterwards.
Oleg Broytman [Wed, 7 Aug 2024 15:28:49 +0000 (18:28 +0300)]
Style(Robots): Rename `check_url` to `check_bookmark`
Also rename `smart_get` to `get_url`.
Oleg Broytman [Tue, 6 Aug 2024 17:52:43 +0000 (20:52 +0300)]
Fix(bkmk-add): Stop the robot
Oleg Broytman [Tue, 6 Aug 2024 16:18:36 +0000 (19:18 +0300)]
Style(bkmk-add): Rename `_robot` -> `robot`
Oleg Broytman [Tue, 6 Aug 2024 16:01:37 +0000 (19:01 +0300)]
Feat(Robots): Do not return error from `check_url()`
Break the entire program with `Ctrl-C`.
Oleg Broytman [Tue, 6 Aug 2024 13:43:58 +0000 (16:43 +0300)]
Feat(bkmk_rrequests): Install socks dependency
Oleg Broytman [Tue, 6 Aug 2024 11:27:46 +0000 (14:27 +0300)]
Feat(bkmk_raiohttp): Use aioftp
Oleg Broytman [Tue, 6 Aug 2024 10:07:50 +0000 (13:07 +0300)]
Fix(Robots): Do not route ftp requests via http(s) proxy
socks5 proxies are ok.
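The rule above could be sketched roughly as follows. `proxy_for` is a hypothetical helper, not a function from the repository; it only assumes that the proxy is given as a URL such as `http://host:port` or `socks5://host:port`.

```python
from urllib.parse import urlsplit


def proxy_for(url, proxy):
    """Return the proxy to use for url, or None for a direct connection.

    Hypothetical helper: FTP URLs are not sent through an HTTP(S)
    proxy, while any other proxy scheme (e.g. socks5) is kept.
    """
    if proxy is None:
        return None
    url_scheme = urlsplit(url).scheme.lower()
    proxy_scheme = urlsplit(proxy).scheme.lower()
    if url_scheme == "ftp" and proxy_scheme in ("http", "https"):
        return None
    return proxy
```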
Oleg Broytman [Mon, 5 Aug 2024 12:00:55 +0000 (15:00 +0300)]
Feat(Robots): Robot based on aiohttp
Oleg Broytman [Tue, 6 Aug 2024 06:16:46 +0000 (09:16 +0300)]
Feat(Writers/bkmk_whtml): Mark special folders
Oleg Broytman [Mon, 5 Aug 2024 20:19:03 +0000 (23:19 +0300)]
Feat: Delete `root_folder.linear` before storing
Oleg Broytman [Mon, 5 Aug 2024 15:19:26 +0000 (18:19 +0300)]
Fix(bkmk_robot_base): Convert environment parameters to integer
Oleg Broytman [Mon, 5 Aug 2024 14:33:33 +0000 (17:33 +0300)]
Feat(bkmk_robot_base): Cut long `data:` icon URLs for logs
Oleg Broytman [Mon, 5 Aug 2024 12:56:23 +0000 (15:56 +0300)]
Fix(bkmk_rcurl): IDNA-encode URLs
PycURL doesn't encode URLs itself
and requires URLs to be in ASCII encoding.
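One way to produce an ASCII-only URL with the standard library is sketched below. This is an illustration, not the commit's actual change: `to_ascii_url` is a hypothetical name, only the host part is IDNA-encoded (using Python's built-in `idna` codec), and non-ASCII characters in the path and query are percent-encoded as UTF-8. Non-ASCII user info in the URL is not handled.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding path and query.
_SAFE = "/%:@!$&'()*+,;=~?"


def to_ascii_url(url):
    """Return url with an IDNA-encoded host and percent-encoded path/query."""
    parts = urlsplit(url)
    netloc = parts.netloc
    if not netloc.isascii():
        # Keep optional "user:password@" and ":port" around the host.
        userinfo, at, hostport = netloc.rpartition("@")
        host, colon, port = hostport.partition(":")
        netloc = userinfo + at + host.encode("idna").decode("ascii") + colon + port
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    return urlunsplit((parts.scheme, netloc, path, query, parts.fragment))
```

A URL that is already ASCII passes through unchanged, so the function can be applied to every bookmark before handing it to PycURL.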
Oleg Broytman [Mon, 5 Aug 2024 10:41:08 +0000 (13:41 +0300)]
Refactor(Robots): Connect timeout
Oleg Broytman [Fri, 2 Aug 2024 10:42:11 +0000 (13:42 +0300)]
Version 5.4.0: Robot based on PycURL
Oleg Broytman [Fri, 2 Aug 2024 10:44:15 +0000 (13:44 +0300)]
Build: Add `devscripts`
Oleg Broytman [Thu, 1 Aug 2024 16:10:30 +0000 (19:10 +0300)]
Fix(bkmk_robot_base): Fix reporting proxy error
Oleg Broytman [Thu, 1 Aug 2024 10:03:16 +0000 (13:03 +0300)]
Feat(Robots): Update X-User-Agent header
Oleg Broytman [Thu, 1 Aug 2024 10:03:03 +0000 (13:03 +0300)]
Feat(Robots): Upgrade headers
Oleg Broytman [Thu, 1 Aug 2024 08:40:32 +0000 (11:40 +0300)]
Docs(bkmk_robot_base): List robots
Oleg Broytman [Thu, 1 Aug 2024 07:26:13 +0000 (10:26 +0300)]
Fix(bkmk_rrequests): Check `r is not None`
It seems `if r` is not enough --
`bool(r)` returns `False` in case there was an HTTP error.
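The pitfall can be shown without network access. `FakeResponse` below is a stand-in written for this example, not part of `requests`; it copies the relevant behaviour of `requests.Response`, whose truth value follows its `ok` property. `describe` is likewise a hypothetical name.

```python
class FakeResponse:
    """Stand-in that mimics the truthiness of requests.Response."""

    def __init__(self, status_code):
        self.status_code = status_code

    @property
    def ok(self):
        # 4xx and 5xx status codes count as errors.
        return not (400 <= self.status_code < 600)

    def __bool__(self):
        return self.ok


def describe(r):
    """Report the status of a response, or the lack of one."""
    # `if r:` would treat an HTTP error like a missing response.
    if r is not None:
        return f"HTTP {r.status_code}"
    return "no response"
```

With `if r:` instead of `if r is not None:`, a 404 response would be reported as "no response" and its status code would be lost.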
Oleg Broytman [Thu, 1 Aug 2024 05:06:27 +0000 (08:06 +0300)]
Style(setup.py): Remove unused import
Oleg Broytman [Thu, 1 Aug 2024 04:34:11 +0000 (07:34 +0300)]
Feat(get_url): Parse args, save/print headers/body
Oleg Broytman [Thu, 1 Aug 2024 04:19:34 +0000 (07:19 +0300)]
Feat(robots): Report robot being used
Oleg Broytman [Wed, 31 Jul 2024 22:47:45 +0000 (01:47 +0300)]
Feat(bkmk_robot_base): Report error on getting icon
Oleg Broytman [Wed, 31 Jul 2024 22:25:07 +0000 (01:25 +0300)]
Feat(get_url): Print headers
Oleg Broytman [Wed, 31 Jul 2024 18:21:48 +0000 (21:21 +0300)]
Update docs
Oleg Broytman [Wed, 31 Jul 2024 18:12:59 +0000 (21:12 +0300)]
Feat(robots): Try robots from a list
Default list is curl,requests,forking.
Oleg Broytman [Wed, 31 Jul 2024 17:29:29 +0000 (20:29 +0300)]
Feat(Robots): Robot based on PycURL
Oleg Broytman [Wed, 31 Jul 2024 16:23:22 +0000 (19:23 +0300)]
Fix(bkmk_robot_base): Do not pass `localhost` via proxy
Oleg Broytman [Wed, 31 Jul 2024 16:22:58 +0000 (19:22 +0300)]
Feat(bkmk_rurllib): Use proxy
Oleg Broytman [Wed, 31 Jul 2024 16:21:58 +0000 (19:21 +0300)]
Style(bkmk_rurllib2): Remove unused import
Found by `flake8`.
Oleg Broytman [Wed, 31 Jul 2024 15:49:11 +0000 (18:49 +0300)]
Refactor(Robots): Move proxy handling to base class
This greatly simplifies robots.
Oleg Broytman [Wed, 31 Jul 2024 15:14:05 +0000 (18:14 +0300)]
Feat(Robots): Return HTTP status code
Oleg Broytman [Fri, 26 Jul 2024 10:12:14 +0000 (13:12 +0300)]
Docs(TODO): Robot(s) that test many URLs in parallel
Increase task priority.
Oleg Broytman [Fri, 26 Jul 2024 10:11:22 +0000 (13:11 +0300)]
Docs(TODO): Robot based on aiohttp
Oleg Broytman [Fri, 26 Jul 2024 01:32:50 +0000 (04:32 +0300)]
Add `setup.cfg` and `setup.py`
Mostly to list required and optional dependencies.
Oleg Broytman [Wed, 24 Jul 2024 21:01:18 +0000 (00:01 +0300)]
Fix(bkmk_db-venv): Do not exit
This is not a shell script, this is a sourced file.
Oleg Broytman [Wed, 24 Jul 2024 20:44:41 +0000 (23:44 +0300)]
Chore: Rename `bkmk-venv` to `bkmk_db-venv`
Oleg Broytman [Wed, 24 Jul 2024 20:43:45 +0000 (23:43 +0300)]
Chore(bkmk-venv): Rename `.venv` to `bkmk_db-venv`
Oleg Broytman [Wed, 24 Jul 2024 02:26:03 +0000 (05:26 +0300)]
Feat: Cleanup redirects
Remove verbiage.
Oleg Broytman [Wed, 24 Jul 2024 02:05:56 +0000 (05:05 +0300)]
Feat: Skip URLs that have '%s'
Oleg Broytman [Wed, 24 Jul 2024 01:47:39 +0000 (04:47 +0300)]
Fix: These are not errors, just duplicates
Oleg Broytman [Tue, 23 Jul 2024 10:08:11 +0000 (13:08 +0300)]
Build(Robots/bkmk_rrequests): Use HTTP(S) proxy instead of SOCKS5
Oleg Broytman [Wed, 6 Mar 2024 15:43:48 +0000 (18:43 +0300)]
Fix(Robot): Stop splitting and un-splitting URLs
Pass `bookmark.href` as is.
Oleg Broytman [Wed, 6 Mar 2024 15:36:17 +0000 (18:36 +0300)]
Fix(get_url): Remove excessive printing
`robot.get()` doesn't really fill the bookmarks,
`robot.check_url()` does but we don't call it here.
Oleg Broytman [Wed, 6 Mar 2024 15:35:03 +0000 (18:35 +0300)]
Rename `check_url.py` to `check_urls.py`