Skip to content

Improper URL Generation #253

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
2 of 5 tasks
dopry opened this issue Aug 10, 2022 · 2 comments
Open
2 of 5 tasks

Improper URL Generation #253

dopry opened this issue Aug 10, 2022 · 2 comments
Labels
help wanted Up for grabs for the community. Probably not a "good first issue", unless labelled as such too Priority: Medium

Comments

@dopry
Copy link
Collaborator

dopry commented Aug 10, 2022

Wagtail grapple prepends most URLs with BASE_URL to create absolute URLs. It doesn't seem like wagtail grapple should be responsible for generating these absolute URLs. It seems a responsibility more suited to a type-specific model class or a consumer of the API. wagtail grapple should have a single responsibility of dynamically exposing internal types and fields to graphene and resolving them using the appropriate models and fields. It shouldn't be responsible for manipulating the data itself.

There are 5 types for which wagtail-grapple manipulates the URLs.

  • embeds

    grapple.types.streamfield.get_media_url

    This implementation of get_media_url is used solely with embeds. Current Embeds all require fully qualified URLs.

    The only reason I would anticipate to support a relative url would be embedding local content. In that case there would need to be an embed finder that could match the relative url. If it stored a relative URL rather than fully qualified, I would expect that finder to be responsible for providing the absolute URL as well. Generally speaking wagtail-grapple shouldn't try to manipulate embed URLS.

  • media

    grapple.utils.get_media_item_url

    Media is not always hosted with the wagtail site. Most large sites I've worked on use cloud storage such as S3 and/or a Media CDN.

    Media.url calls through AstractMedia.file -> FileField.url -> Storage.url

    django.core.files.storage.FileSystemStorage

    def url(self, name):
         if self.base_url is None:
             raise ValueError("This file is not accessible via a URL.")
         url = filepath_to_uri(name)
         if url is not None:
             url = url.lstrip('/')
         return urljoin(self.base_url, url)
    
    @cached_property
    def base_url(self):
         if self._base_url is not None and not self._base_url.endswith('/'):
             self._base_url += '/'
         return self._value_or_setting(self._base_url, settings.MEDIA_URL)

    I'm the Storage implementation is responsible for constructing a full URL. We shouldn't need the if url[0] == "/": branch. If we did enter it we should probably be preferencing MEDIA_URL to WAGTAIL_ADMIN_BASEURL and checking the _base_url property of the relevant storage.

  • image

    wagtail.images.models.[WagtailImage,WagtailImageRendition] also use storage and are returning file.url. The same things that apply to media apply to images. There are also some sheninigans with grapple.types.images calling get_media_item_url which checks self.url and self.file.url rather than just explicitly returning self.file.url for image and self.url renditions.

  • documents

    AbstractDocument.url is an outlier. It serves up file.url when WAGTAILDOCS_SERVE_METHOD is direct, then falls back to reverse("wagtaildocs_serve"). Prepending BASE_URL may be valid in this case, but that is assuming I'm not serving this site off an alternate domain and that the site url isn't preferred, and that I am not pushing documents through the same CDN as the rest of my media.

  • redirects

    grapple.types.redirects.RedirectType.resolve_old_url

    I'm not sure we need to add a base_url here either. Shouldn't this include the appropriate site for the old path? self.site.root_url()+self.old_path This doesn't seem to take into account the possibilty of all sites as well. old_url isn't actually a property of the redirect. I feel like we should be returning site and urlPath only here... We shouldn't be adding/removing properties on the native Wagtail models.

Originally posted by @dopry in #234 (comment)

@zerolab zerolab added help wanted Up for grabs for the community. Probably not a "good first issue", unless labelled as such too Priority: Medium labels Nov 29, 2022
@kbayliss
Copy link
Collaborator

Thanks for raising this, @dopry.

@zerolab we're going to start working on redirects soon as we need them within a multi-site setup. Is there anything we should be aware of before we start?

@dopry
Copy link
Collaborator Author

dopry commented Nov 20, 2023

@kbayliss the URL generation issues in wagtail are more generalize than just redirects so bears some thought. see #274 and #306

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Up for grabs for the community. Probably not a "good first issue", unless labelled as such too Priority: Medium
Projects
None yet
Development

No branches or pull requests

3 participants