Issue295 fixes #295 by JessicaTegner · Pull Request #296 · JessicaTegner/pypandoc
This PR fixes a critical error in download_pandoc, where the URLs to the binaries stopped being populated.
This is a suggested solution that I'll leave open for discussion before merging.
Thanks for the fix!
I always see another error when running the pypandoc download:
    [INFO] Downloading pandoc from https://github.com/jgm/pandoc/releases/download/2.19.2/pandoc-2.19.2-1-amd64.deb ...
    [INFO] Unpacking /tmp/tmpyhpyagw4/pandoc-2.19.2-1-amd64.deb to tempfolder...
    [INFO] Copying pandoc to /tmp/runner/build-docs/bin ...
    [INFO] Making /tmp/runner/build-docs/bin/pandoc executeable...
    [INFO] Copying pandoc-citeproc to /tmp/runner/build-docs/bin ...
    Error: Didn't copy pandoc-citeproc
This is just a print statement and does not cause the program to crash, but it is still a bit annoying. Is this something that can be easily disabled?
@kevalmorabia97 this can indeed be disabled. See the README for instructions, specifically the "logging messages" section.
Thanks for the quick fix! I tested it on MacOS and it works.
I came across this issue, thanks for the quick fix! I can confirm it works well on Ubuntu 22.04.
Querying api.github.com makes perfect sense to me, and will hopefully be more stable.
@FedeMPouzols good that it also works there.
One thing I don't like is that api.github.com has a rate limit of 60 requests per hour for unauthenticated users.
This might not seem like a big deal, but consider that previously there was no limit, because we were reading the HTML. Bigger pipelines might break if the download is run many times an hour (even our own tests sometimes break because of it).
I have not yet found a fix for this that I like.
I'm open to suggestions.
@kevalmorabia97 that would solve the rate limiting issue for every version, except when using "latest".
Even though we only hit api.github.com once, if we use "latest" as the version, the rate limiting will still happen.
The reason is that we extract the asset download URLs from the JSON data returned by that call.
If we could find another way of getting the latest version number that doesn't involve hitting api.github.com, your solution might work perfectly.
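To make the JSON step above concrete, here is a minimal sketch of pulling asset download URLs out of a release payload. It is not the code from this PR: the helper name `extract_asset_urls` is hypothetical, and the sample payload is a hand-trimmed illustration that only keeps the `tag_name`, `assets`, `name`, and `browser_download_url` fields of a GitHub release response.

```python
import json


def extract_asset_urls(release_json):
    """Return the download URL of every asset in a GitHub release payload.

    `release_json` is the raw JSON text of a single release object.
    Hypothetical helper for illustration, not pypandoc's implementation.
    """
    release = json.loads(release_json)
    # A release without assets simply yields an empty list.
    return [asset["browser_download_url"] for asset in release.get("assets", [])]


# Hand-trimmed sample payload (illustrative, not a full API response).
sample_release = json.dumps(
    {
        "tag_name": "2.19.2",
        "assets": [
            {
                "name": "pandoc-2.19.2-1-amd64.deb",
                "browser_download_url": (
                    "https://github.com/jgm/pandoc/releases/download/"
                    "2.19.2/pandoc-2.19.2-1-amd64.deb"
                ),
            }
        ],
    }
)

asset_urls = extract_asset_urls(sample_release)
print(asset_urls)
```

Because the parsing is separate from the network call, it can be exercised in tests without touching api.github.com and without spending any of the 60 requests per hour.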
What if we cache the latest version and keep the cache alive for a minute? This would ensure we hit the API at most once per minute.
That could also work. However, I think I have found a solution that combines what you suggested with what I found after doing some research. Do let me know what you think :)
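The caching idea could look roughly like the sketch below. Everything here is an assumption for illustration: the `CachedValue` class, its parameters, and the stand-in fetch function are hypothetical and not part of pypandoc. The cache lives in memory, so it only limits calls within a single process.

```python
import time


class CachedValue:
    """Cache the result of `fetch` for `ttl_seconds` (hypothetical helper)."""

    def __init__(self, fetch, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the expiry logic can be tested
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        # Refresh on first use or once the cached entry has expired.
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


def fetch_latest_version():
    # Stand-in for the real lookup; a real version would query GitHub here.
    return "2.19.2"


latest_version = CachedValue(fetch_latest_version, ttl_seconds=60.0)
print(latest_version.get())  # first call runs the fetch
print(latest_version.get())  # served from the cache for the next minute
```

Separate CI jobs would each start with an empty cache, so sharing the value across runs would need something persistent such as a file on disk.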
This is even better! I tested it on macOS and it works.
@kevalmorabia97 also works on Windows here.
I don't like that the tests are failing, but that seems to be for other reasons and not caused by this PR.
I'll leave this open for a little while to see if other people have comments, while I look at better CI pipeline options.
Sounds good. For now I'm passing a hard-coded URL to the pandoc_download function.
Is it possible to make a new release/tag for this working state?
    response = urlopen(url)
    content = response.read()
    pattern = re.compile(r"pandoc\s*([\d.]+)")
    version = re.search(pattern, content.decode("utf-8")).group(1)
It might be easier/faster to parse the returned URL here:
    >>> from urllib.request import urlopen
    >>> url = "https://github.com/jgm/pandoc/releases/latest"
    >>> response = urlopen(url)
    >>> response.url
    'https://github.com/jgm/pandoc/releases/tag/2.19.2'
And once you have that URL, you can .replace('tag', 'expanded_assets') to get the one with links.
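The redirect-based approach can be sketched as follows. The function names are hypothetical, and the sketch replaces the path segment `/tag/` rather than the bare string `tag`, a small deviation from the suggestion above that avoids touching other parts of the URL. Only `resolve_latest_tag_url` touches the network, and it is defined here but not called.

```python
from urllib.request import urlopen


def resolve_latest_tag_url(latest_url="https://github.com/jgm/pandoc/releases/latest"):
    """Follow the redirect from .../releases/latest and return the final URL."""
    with urlopen(latest_url) as response:
        return response.url


def version_from_tag_url(tag_url):
    """Take the version from the last path segment of a .../releases/tag/<version> URL."""
    return tag_url.rstrip("/").rsplit("/", 1)[-1]


def expanded_assets_url(tag_url):
    """Turn a .../releases/tag/<version> URL into the page that lists asset links."""
    return tag_url.replace("/tag/", "/expanded_assets/")


# Offline demonstration using the tag URL shown in the comment above.
tag_url = "https://github.com/jgm/pandoc/releases/tag/2.19.2"
print(version_from_tag_url(tag_url))
print(expanded_assets_url(tag_url))
```

This route never calls api.github.com, so the API rate limit discussed earlier in the thread does not apply to it.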
@enochtangg as a temporary option, you can update your dependencies to pypandoc @ https://github.com/JessicaTegner/pypandoc/archive/refs/heads/master.zip to check that it works in your environment.