🐞 Destination databricks: update jdbc driver to patch log4j by tuliren · Pull Request #7622 · airbytehq/airbyte
Conversation (21) · Commits (15) · Checks (0) · Files changed
Conversation
What

- Update the Spark JDBC driver to patch the Log4j vulnerability.
- Download the driver file automatically in the `test` and `publish` commands (see the sketch below).
- The new version will be published in 🎉 Destination databricks: update fields specifications #9153.
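For illustration, here is a minimal sketch of the download-if-absent behaviour described above. The class name, driver URL, and target path are placeholders, not the connector's actual build configuration:

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class SparkJdbcDriverFetcher {

  // Placeholder values; the real URL and jar location live in the connector's build scripts.
  private static final String DRIVER_URL = "https://databricks.example.com/SparkJDBC42.jar";
  private static final Path DRIVER_PATH = Path.of("lib", "SparkJDBC42.jar");

  /** Download the Spark JDBC driver only if it is not already present locally. */
  public static void ensureDriverPresent() throws IOException {
    if (Files.exists(DRIVER_PATH)) {
      return;
    }
    Files.createDirectories(DRIVER_PATH.getParent());
    try (final InputStream in = URI.create(DRIVER_URL).toURL().openStream()) {
      Files.copy(in, DRIVER_PATH, StandardCopyOption.REPLACE_EXISTING);
    }
  }
}
```

A helper like this can be invoked before the test and publish steps so the driver jar never needs to be checked into the repository.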
@davinchia, I manually created an `io-airbyte-build-dependencies` bucket in GCS. Should I do it through some Terraform script instead?
Questions from me:

- Did you do anything special to grant the spec service account permissions to this bucket?
- Maybe this is a dumb question - can we put the JDBC driver into our CloudRepo Maven repository instead? This way all our public Java artifacts are in one place and we don't have an additional bucket to manage.
> Did you do anything special to grant the spec service account permissions to this bucket?

I granted this service account read permission to the bucket on the permissions page in GCS.
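For context, a minimal sketch of how a consumer with that read grant could fetch the jar, assuming the google-cloud-storage Java client and Application Default Credentials; the object name is a placeholder:

```java
import com.google.cloud.storage.Blob;
import com.google.cloud.storage.BlobId;
import com.google.cloud.storage.Storage;
import com.google.cloud.storage.StorageOptions;

public class BuildDependencyReader {

  /** Reads an object from the build-dependencies bucket using the service account's read grant. */
  public static byte[] readDriverJar() {
    // Application Default Credentials resolve to the service account that was granted read access.
    final Storage storage = StorageOptions.getDefaultInstance().getService();
    // Bucket name is from the discussion above; the object name is a placeholder.
    final Blob blob = storage.get(BlobId.of("io-airbyte-build-dependencies", "SparkJDBC42.jar"));
    return blob.getContent();
  }
}
```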
> Can we put the JDBC driver into our CloudRepo Maven repository instead? This way all our public Java artifacts are in one place and we don't have an additional bucket to manage.

- Where is this CloudRepo Maven repository?
- What do you mean by "public"? This jar cannot be publicly downloadable by anyone on the internet; to download it, people need to accept certain terms & conditions.
- The jar is publicly available here. I just don't know whether we can download it directly from that link programmatically. If we can put it somewhere private, we avoid the potential legal issues.
CloudRepo is the Java artifact hosting service we are using: cloudrepo.io. Login credentials are in LastPass under `cloudrepo.io` in the `shared-engineering` folder. This is where Java artifacts from OSS are pushed.
Got it. I didn't realise we wanted to keep this private. If you aren't rushing, I would prefer creating a private repository in CloudRepo and setting things up so we publish and pull from there. The main thing here is to keep all the Java artifacts in one place. I can take a stab at this on Monday/Tuesday if you aren't free.
What do you think?
Got it, thanks. I think I will just change it to download from Databricks each time. That does not seem to break the terms & conditions, and it's not worth the time to be overly careful. But thank you all the same.
Okay! Happy to relook when you make those changes. Thanks Liren.
```diff
@@ -77,7 +77,7 @@ public StandardCheckConnectionOutput run(final StandardCheckConnectionInput input
       LOGGER.debug("Check connection job received output: {}", output);
       return output;
     } else {
-      throw new WorkerException("Error while getting checking connection.");
+      throw new WorkerException(String.format("Error checking connection, status: %s, exit code: %d", status, exitCode));
```
@jrhizor, a check operation failed in the acceptance test. I added more information to the exception message and saw this:
Error checking connection, status: Optional[io.airbyte.protocol.models.AirbyteConnectionStatus@605d010e[status=SUCCEEDED,message=<null>,additionalProperties=***]], exit code: 143
Do you know what might be the root cause behind this exit code? Could it be related to the recent change in WorkerUtils#gentleClose?
I don't think the gentle close changes look related. Are you sure this docker container isn't actually outputting an error code?
Here is what happens:

- The database connection somehow requires an explicit `close` method call; otherwise, the connector will not shut down.
- The worker waits 1 minute for the connector to shut down, then closes it by force (code here) by sending a `SIGTERM`.
- The `SIGTERM` results in a 143 exit code (128 + 15, the signal number for `SIGTERM`).

So the root cause is that for Databricks we must always close the database connection at the end, which the default check command implementation in `CopyDestination` did not do. The issue has been resolved (see the sketch below).
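For illustration, a minimal sketch of the resulting pattern (not the actual `CopyDestination` code), using a plain JDBC connection. The point is that try-with-resources guarantees the connection is closed on every path, so the connector process can exit on its own instead of being force-killed:

```java
import java.sql.Connection;
import java.sql.DriverManager;

import io.airbyte.protocol.models.AirbyteConnectionStatus;
import io.airbyte.protocol.models.AirbyteConnectionStatus.Status;

public class DatabricksCheckSketch {

  /**
   * Hypothetical check routine. The try-with-resources block always closes the connection;
   * without the close, the connector JVM keeps running and the worker kills it with SIGTERM,
   * which surfaces as exit code 143 even though the check itself succeeded.
   */
  public AirbyteConnectionStatus check(final String jdbcUrl) {
    try (final Connection connection = DriverManager.getConnection(jdbcUrl)) {
      connection.createStatement().execute("SELECT 1");
      return new AirbyteConnectionStatus().withStatus(Status.SUCCEEDED);
    } catch (final Exception e) {
      return new AirbyteConnectionStatus().withStatus(Status.FAILED).withMessage(e.getMessage());
    }
  }
}
```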
tuliren changed the title from Download jdbc driver for databricks connector to 🐞 Destination databricks: update jdbc driver to patch log4j
tuliren linked an issue that may be closed by this pull request
tuliren deleted the liren/auto-download-spark-jdbc branch
Reviewers

- jrhizor: left review comments
- sherifnada: awaiting requested review
- davinchia: awaiting requested review