fix DISTINCT for Oracle databases by trollknurr · Pull Request #2935 · encode/django-rest-framework

fix error "DatabaseError: ORA-00932: inconsistent datatypes: expected - got NCLOB" for oracle db

Presumably this fix means we'll just have silently broken behavior, by not including the required distinct?

For more background on this, from the docs:

LOB columns may not be used in a SELECT DISTINCT list. This means that attempting to use the QuerySet.distinct method on a model that includes TextField columns will result in an error when run against Oracle. As a workaround, use the QuerySet.defer method in conjunction with distinct() to prevent TextField columns from being included in the SELECT DISTINCT list.

@jpadilla The QuerySet defer(field_name) method won't work when we want to search on that same field_name.

DRF 2.4 works fine without distinct.

I believe that the distinct was added due to a bug being found since the 3.0 release. I've no reason to believe that it wasn't also present in 2.x, but it only manifests with certain lookups.

(Someone would need to track down the issue, I don't remember exactly, but git blame would probably point in the right direction)

> I believe that the distinct was added due to a bug being found since the 3.0 release.

Confirmed.

Right - so I'd assume merging this fix would also mean that we'd be silently breaking #2535 for users of Oracle? I assume there's no good solution here?

That's right, I broke #2535. Here is an updated fix.
Using a set adds the overhead of one more query, but Oracle users want to use SearchFilter.
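To make the trade-off concrete, here is a minimal sketch of the two-step primary-key approach, using the standard-library `sqlite3` module as a stand-in. SQLite has no LOB restriction, so this only illustrates the shape of the queries, not the Oracle error itself. The `article` and `tag` tables are hypothetical and are not part of this pull request.

```python
import sqlite3

# Hypothetical schema: one article can have many tags.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, article_id INTEGER, name TEXT);
    INSERT INTO article VALUES (1, 'first'), (2, 'second');
    INSERT INTO tag VALUES (1, 1, 'oracle-db'), (2, 1, 'oracle-orm'), (3, 2, 'misc');
""")

search = "%oracle%"

# Searching across the join returns article 1 once per matching tag.
joined = conn.execute(
    "SELECT a.id, a.body FROM article a JOIN tag t ON t.article_id = a.id "
    "WHERE t.name LIKE ? ORDER BY a.id",
    (search,),
).fetchall()

# Step 1 (the extra query): collect matching primary keys, deduplicated by the set.
pks = {
    row[0]
    for row in conn.execute(
        "SELECT a.id FROM article a JOIN tag t ON t.article_id = a.id "
        "WHERE t.name LIKE ?",
        (search,),
    )
}

# Step 2: filter the base table by those keys. There is no join here,
# so no duplicates appear and no DISTINCT over the text column is needed.
placeholders = ", ".join("?" for _ in pks)
deduped = conn.execute(
    f"SELECT id, body FROM article WHERE id IN ({placeholders}) ORDER BY id",
    sorted(pks),
).fetchall()
```

The cost is the one mentioned above: the primary keys are fetched in a separate query before the final result is selected.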

tomchristie

    queryset = queryset.distinct()
else:
    pk_list = queryset.values_list('pk', flat=True)
    queryset = view.model.objects.filter(pk__in=set(pk_list))

Using view.model.objects here feels a little bit brittle. I think we can avoid needing that. Does something like the following make sense?...

if settings.DATABASES[queryset.db]["ENGINE"] != "django.db.backends.oracle":
    return queryset.filter(reduce(operator.or_, or_queries)).distinct()
else:
    pk_list = queryset.filter(reduce(operator.or_, or_queries)).values_list('pk', flat=True)
    return queryset.filter(pk__in=set(pk_list))
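One way to read the suggestion above is as a small helper, sketched below. It is not the code that was merged. The names `is_oracle` and `search_without_duplicates` are hypothetical, and the `databases` argument stands in for `settings.DATABASES` so the sketch has no Django dependency. It assumes only that `queryset` exposes `db`, `filter`, `distinct` and `values_list` the way a Django QuerySet does.

```python
ORACLE_ENGINE = "django.db.backends.oracle"


def is_oracle(databases, alias):
    """Return True if the connection named `alias` is configured for Oracle."""
    return databases[alias]["ENGINE"] == ORACLE_ENGINE


def search_without_duplicates(queryset, condition, databases):
    """Apply `condition` and remove duplicate rows.

    Hypothetical helper: `databases` plays the role of settings.DATABASES.
    """
    filtered = queryset.filter(condition)
    if not is_oracle(databases, queryset.db):
        return filtered.distinct()
    # Oracle cannot SELECT DISTINCT over LOB columns (e.g. TextField), so
    # collect the matching primary keys first and re-filter the original
    # queryset by them. This costs one extra query.
    pk_list = filtered.values_list("pk", flat=True)
    return queryset.filter(pk__in=set(pk_list))
```

Filtering the original `queryset` rather than `view.model.objects` keeps any restrictions the view already applied.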

Side note here: Since when is the model attribute back on views? I thought that was removed.

Avoided using view.model.objects, plus a few small enhancements.

tomchristie

@@ -100,13 +101,17 @@ def filter_queryset(self, request, queryset, view):

orm_lookups = [self.construct_search(six.text_type(search_field))
               for search_field in search_fields]

and_queries = []

Not sure I understand the motivation here - feels more obscure after this change rather than more clear.

I'm not sure that calling queryset.filter(pk__in=set(pk_list)) will make the duplicates go away, as those pks will still be in pk_list. I want to filter the original queryset by pk_list.
When we call filter with multiple kwargs, or several times on one queryset, it is equivalent to a logical AND.
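The AND/OR structure being discussed can be sketched in plain Python, without Django. This is not DRF's implementation. The `matches` function and the case-insensitive substring test are assumptions made for illustration. Each search term must match at least one field (OR within a term), and every term must match (AND across terms, like chained `filter()` calls).

```python
import operator
from functools import reduce


def matches(record, search_fields, search_terms):
    """Return True if every term appears in at least one of the fields."""
    for term in search_terms:
        # OR across fields for this single term.
        per_field = [
            term.lower() in str(record[field]).lower()
            for field in search_fields
        ]
        if not reduce(operator.or_, per_field, False):
            # AND across terms: one unmatched term rejects the record.
            return False
    return True


records = [
    {"title": "Oracle notes", "body": "distinct fails on NCLOB"},
    {"title": "Postgres notes", "body": "distinct works"},
]
hits = [r["title"] for r in records if matches(r, ["title", "body"], ["notes", "nclob"])]
```

Here only the first record ends up in `hits`, because the second record matches "notes" but not "nclob".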

tomchristie

    and_queries.append(reduce(operator.or_, or_queries))

if and_queries:
    if settings.DATABASES[queryset.db]["ENGINE"] == "django.db.backends.oracle":

We'll want a brief comment here regarding needing this behavior for oracle and distinct.

@tomchristie Because of Oracle DB limitations, it is not possible to apply DISTINCT to *LOB columns. Users still need SearchFilter to work with a TextField (for example). That's why I'm proposing this workaround for Oracle.

Clarification: I meant that we should have a code comment for this.

@xordoquy thinking we can also milestone this for next release as it seems to be 99% completed.

I won't have the time to review it.
I'll try to get the release out by tomorrow evening which doesn't leave much time.

tomchristie

for search_term in self.get_search_terms(request):
    or_queries = [models.Q(**{orm_lookup: search_term})
                  for orm_lookup in orm_lookups]
    queryset = queryset.filter(reduce(operator.or_, or_queries)).distinct()

I'd prefer it if we didn't introduce the and_queries. We should keep each pull request as absolutely minimal as possible. Seems like there's two different changes here - the Oracle support as one, and the change in the filtering style as another. Let's just adopt the first of those two.
Apologies for the slow progress on this, but thanks for getting it nearly there! :)

@tomchristie I'm not sure about the failed test. Is it my fault?

kevin-brown

@@ -152,7 +163,7 @@ def remove_invalid_fields(self, queryset, fields, view):
field.source or field_name
for field_name, field in serializer_class().fields.items()
if not getattr(field, 'write_only', False)
]
]

This doesn't need to be indented another level.

This is what is causing Flake8 to fail by the way.
