Analysis: An Honest Review of Gatsby

Here’s an article exposing several flaws of Gatsby as a framework, based on the author’s experience of applying it to a documentation site (using MDX).

Annotated version → https://diigo.com/0iof0d

Some key ideas from the article:

Gatsby builds are slow

At this point though it was becoming clear build times were a problem

You’d spend at least 5 minutes on image optimization alone, with no way to even disable that

we were crippling our iteration speed as well, since the build cache would invalidate under a variety of situations in early development

It seems build time is a real issue, as the build process cannot be customized to fit different needs. For example, it seems you cannot disable image optimization in the build (in case you have handled that by another method).

Deployment is not flexible

We started off leveraging what we had already done: deploying Jekyll with Docker onto our own infrastructure - effectively just proxied via a CDN

Also, it seems Gatsby is not flexible enough to adapt to custom infrastructure.

PS: I don’t fully understand why they couldn’t re-use their Docker infrastructure

GraphQL is a real pain for some users

trying to get objects modeled into a format that works in a semi-correct way in Gatsby is terrifying

It seems GraphQL is not really that widely used by developers yet, and adopting it is proving to be a pain for some users. Besides this, it also seems it doesn’t fit well with some use cases.

GraphQL doesn’t really make sense for a static site generator

It’s a static website generator. It literally does not need GraphQL all over the place

it shouldn’t require a GraphQL API to read objects that are already in memory

I never thought of this, but it’s true. One of the advantages of GraphQL is that you can get a customized JSON with only the data you need, greatly reducing the size of the JSON requested from the server and improving speed and developer experience (fewer data transformations will be needed).

But with a Static Site Generator this makes less sense, as most of the requests are made at build time. Only requests made client-side will fully benefit from the advantages of using GraphQL.

So, conclusion on this → Using GraphQL in an SSG seems to be over-engineering.
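The "data already in memory" point can be illustrated with a toy sketch in plain Node.js (the post object and field names here are made up for illustration): at build time the content is already a JavaScript object, so a page can pick exactly the fields it needs with plain destructuring, no schema, resolvers, or query language required.

```javascript
// Toy sketch with hypothetical data: at build time the post is
// already a plain object in memory.
const post = {
  title: "Getting Started",
  body: "…",
  author: { name: "Jane", email: "jane@example.com" },
};

// Selecting only the fields a page needs is plain destructuring,
// which is exactly what a GraphQL query would do, minus the query layer.
function pageData({ title, author: { name: authorName } }) {
  return { title, authorName };
}

console.log(pageData(post));
// → { title: 'Getting Started', authorName: 'Jane' }
```

The selective-fields benefit of GraphQL is real over the network, but at build time the selection is just object access.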

MDX is not as flexible as it seems

MDX is supposed to ease the integration of JSX code in Markdown documents, but when applied to real-world problems it seems there are some limitations.

I don’t need JSX syntax in my markdown, I need a way to include JSX components. That might sound similar, but it’s quite a different thing.

It seems MDX is not succeeding in allowing the inclusion of JSX components in Markdown documents.
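For context, this is roughly what MDX offers: full JSX syntax embedded in the document (the `Callout` component and its import path here are made up for illustration). The article’s complaint is that this is more than, and different from, simply being able to drop a component into Markdown:

```mdx
import { Callout } from "../components/Callout"

# Installation

Some regular Markdown text.

<Callout type="warning">
  The whole file is now compiled as JSX, so an unclosed tag anywhere
  in the document can break the build.
</Callout>
```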


I share these ideas as a way to start a discussion and gather feedback from the team on these points, and maybe take these drawbacks into account for:

  • Marketing purposes: pushing messages where Frontity can be a solution to some of these issues (build times, for example)
  • Development purposes: the flexibility of the framework can be a real asset for framework adoption

I think the main benefit that Gatsby extracts from using GraphQL is that the developer has to clearly define the data needs for each page type, and those data needs can be the join of multiple sources. Then, Gatsby saves the returned data in a page.json which is linked to each URL and saved statically along with the HTML.
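For reference, this is roughly what declaring those data needs looks like in Gatsby: each page template exports a GraphQL page query that Gatsby runs at build time (the field names here are illustrative; the `mdx` field comes from the MDX source plugin):

```graphql
# Run by Gatsby at build time; the result becomes the page's JSON.
query DocPage($slug: String!) {
  mdx(fields: { slug: { eq: $slug } }) {
    frontmatter {
      title
    }
    body
  }
}
```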

I think this makes sense for an SSG, where the main benefit is that you can shut down your data server.
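For reference, the per-page JSON Gatsby emits (under `page-data/<path>/page-data.json` in recent versions) looks roughly like this; the exact keys vary between Gatsby versions:

```json
{
  "componentChunkName": "component---src-templates-doc-js",
  "path": "/docs/getting-started/",
  "result": {
    "data": {
      "mdx": {
        "frontmatter": { "title": "Getting Started" },
        "body": "…"
      }
    }
  }
}
```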

In Frontity, the framework doesn’t know about the specific data needs of a page. The closest we get is the fetches inside our handlers. I guess we could bake in something similar, but I don’t see the need for that yet.
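As a toy sketch (not real Frontity code) of what "data needs inside handlers" means: URL patterns map to async functions that fetch and store whatever that page needs, so the data needs live in imperative code rather than in a declarative query.

```javascript
// Toy model of pattern-based handlers; names and shapes are made up,
// this is not the real Frontity API.
const state = { source: { data: {} } };

const handlers = [
  {
    pattern: /^\/docs\/([^/]+)\/$/,
    func: async (link, slug) => {
      // A real handler would fetch from the WordPress REST API here.
      state.source.data[link] = { type: "doc", slug };
    },
  },
];

async function fetchData(link) {
  for (const h of handlers) {
    const match = link.match(h.pattern);
    if (match) return h.func(link, match[1]);
  }
}

(async () => {
  await fetchData("/docs/getting-started/");
  console.log(state.source.data["/docs/getting-started/"]);
  // → { type: 'doc', slug: 'getting-started' }
})();
```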

After all, if you generate static HTML files and are able to “observe” the data source, it should be easy to extract the list of used URLs and cache them individually, instead of “per page” like Gatsby does. I explain this a little further in this video I made yesterday for Strattic which will face the same problem if they ever want to integrate with Frontity: https://www.youtube.com/watch?v=tpheDFf37z4


Regarding this, there are projects like this one, which allow the developer to create a gateway with custom endpoints that join data from different endpoints, or even from different sources, using REST APIs.

So, for further reference, in case someone argues for a migration to GraphQL to do something like this.
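The gateway idea can be sketched in a few lines of plain JavaScript. The two "endpoints" are mocked here; in a real gateway they would be `fetch()` calls against two different REST APIs:

```javascript
// Mocked REST endpoints; a real gateway would use fetch() against
// two different APIs.
const fetchPost = async (id) => ({ id, title: "Hello", authorId: 7 });
const fetchAuthor = async (id) => ({ id, name: "Jane" });

// One custom gateway endpoint that joins both sources, so the client
// makes a single request and gets exactly the shape it needs.
async function postWithAuthor(postId) {
  const post = await fetchPost(postId);
  const author = await fetchAuthor(post.authorId);
  return { id: post.id, title: post.title, author: author.name };
}

postWithAuthor(1).then(console.log);
// → { id: 1, title: 'Hello', author: 'Jane' }
```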

Yes, it makes sense. I hadn’t thought of this.

Very interesting! :+1: If the REST API is also cached, then just by requesting all the available URLs, the JSON responses will also be cached, which will make it work completely from CDN servers.

I understand that additional requests made from the client side don’t need to be cached or taken into account here, right?

How could a list of used URLs be extracted? Is this something that can be done from WordPress?

And another thing… is there a chance you kept the link of the diagram used in the video? :blush:
I’d like to add it here → https://www.notion.so/frontity/2245c3ceff8f485e9eb38828f42c8aba?v=4cabf47cedb44246b2f78cb52204acea

Those client-side-only requests won’t be cached until a user visits the page.

But if you are trying to generate all the static assets by making an HTTP request to each URL in the list, then no, those won’t be cached. That also happens in GatsbyJS, by the way.

To avoid that problem, you could either make those requests manually, or use a crawler that not only makes the HTTP request but also loads the JavaScript and navigates the site.

Yes, it can be done. You have the list of links in the sitemap, although some links may be missing. You could also get them from the REST API.
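Extracting the URL list from a sitemap is straightforward. Here is a minimal sketch (regex-based, good enough for a standard `sitemap.xml`; a real implementation should use an XML parser):

```javascript
// Pull every <loc> entry out of a standard sitemap.xml document.
function urlsFromSitemap(xml) {
  return [...xml.matchAll(/<loc>\s*([^<]+?)\s*<\/loc>/g)].map((m) => m[1]);
}

const sitemap = `
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs/getting-started/</loc></url>
</urlset>`;

console.log(urlsFromSitemap(sitemap));
// → [ 'https://example.com/', 'https://example.com/docs/getting-started/' ]
```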

I’m not sure exactly how services like Strattic do so, but when I talked with the guy that created https://web2static.com/ he told me that it was a combination of everything (REST API, Sitemaps, Crawler).

In Gatsby, it is the user who needs to provide a query to get those links.

Done! :slightly_smiling_face:
