Show HN: Blades – fast static site generator written in Rust (getblades.org)
146 points by marosgrego on Oct 12, 2020 | 90 comments


I'm going to try this out at some point this week, but the one item that makes me pause: the sitemap. Hugo is fast, but it doesn't do incremental builds. Once you get to the point where you're publishing tens of thousands of posts, Hugo starts to get real slow (like 20 minutes to build, even after enabling template caching), so an increase in speed during build time is welcome. But once you get to 50K posts you realize that Hugo doesn't follow Google's convention to break your sitemap into separate files with an index map[0]. Hugo doesn't have a handler for this and as of now I haven't figured out how to modify the sitemap compilation process to be compliant (not like I've tried too hard, I'm not native with Go and it isn't clear to me where in the code the sitemap is being built). If Blades would handle the sitemap issue for me without needing a plugin or modification, that would be huge for a lot of people on the Hugo community boards.

[0] https://support.google.com/webmasters/answer/183668?hl=en


Blades currently doesn't break the sitemap, but it can easily be implemented. I see that 50k pages is a maximum for one sitemap file. Do you think it should break into separate pages only after reaching 50k, or sooner?


IMO the easiest implementation would always break into an index and sitemap(s) as needed. 49,999 URLs? 1 index and 1 sitemap. 50K+ URLs? 1 index and sitemaps as needed. This gives you the most consistent output and users who cross the 50K threshold over time will not need to update their sitemap in search tools to be compliant since pointing to the index will already deal with that.
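
A minimal sketch of that "always an index, sitemaps as needed" layout, in Python for illustration. This is not how Blades or Hugo do it; the file names (`sitemap.xml`, `sitemap-N.xml`) and the `build_sitemaps` helper are made up for the example, and it only enforces the 50,000-URL limit, not the per-file size limit.

```python
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-file URL limit from the sitemap protocol
NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XML_DECL = '<?xml version="1.0" encoding="UTF-8"?>'


def build_sitemaps(urls, base_url, max_urls=MAX_URLS):
    """Return {filename: xml_text}: one index plus as many sitemaps as needed."""
    files = {}
    index_entries = []
    for n, start in enumerate(range(0, len(urls), max_urls), start=1):
        name = f"sitemap-{n}.xml"  # hypothetical naming scheme
        chunk = urls[start:start + max_urls]
        body = "".join(f"<url><loc>{escape(u)}</loc></url>" for u in chunk)
        files[name] = f'{XML_DECL}<urlset xmlns="{NS}">{body}</urlset>'
        index_entries.append(
            f"<sitemap><loc>{escape(base_url + name)}</loc></sitemap>"
        )
    # The index is always written, even when there is only one sitemap,
    # so the URL submitted to search tools never has to change.
    files["sitemap.xml"] = (
        f'{XML_DECL}<sitemapindex xmlns="{NS}">'
        + "".join(index_entries)
        + "</sitemapindex>"
    )
    return files
```

With 49,999 URLs this yields `sitemap.xml` plus `sitemap-1.xml`; crossing the threshold just adds `sitemap-2.xml` to the same index.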


You are not wrong, although I would argue that use cases where you have 50k blog posts (or pages) are not that common...


I should have been more clear on the terminology - 50K URLs. Which doesn't always mean 50K posts or pages, but those plus all your taxonomy pages plus all your pagination pages. A 10K post or page site can give you 50K URLs pretty easily if it uses taxonomies and pagination.

My case might be unique where the site is being built at the end of a pipeline that sorts through millions of rows of data and gives you the content you want based on your filters, but it's honestly not difficult to get to 50K URLs for many sites that have some age on them.


I would say a 10k page or post site is also very rare.


Seems to me like every newspaper would accumulate 10k pages within a year or two?


I think static generators are mostly used for personal blogs or pages, but I'd love to hear about newspapers that use them!


Nah, you'd be wrong. I use a password protected static site as my knowledge dump. I accumulate 1000s of posts in months.


You doing it is not the definition of ‘not rare’ :P


I think the conversation here is getting a little sidetracked :)

(not indicating Aeolun specifically, just replied to the lowest level)


Just ran my company's internal Hugo build on commodity hardware (4 cores), and I get roughly 800 pages per second, i.e. a 3966 page site took "4841ms" - so 20,000 pages, assuming linear scaling, would still only take 20-30 seconds. Not minutes. Perhaps our theme isn't complex enough?
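
The linear-scaling estimate can be checked directly from the two figures quoted (3966 pages, 4841 ms). Linear scaling is itself an assumption; features whose cost grows with site size would break it.

```python
pages = 3966
build_ms = 4841

# Throughput observed on the small site.
pages_per_second = pages / (build_ms / 1000)

# Projected build time for 20,000 pages if throughput stays constant.
projected_seconds = 20_000 / pages_per_second

print(f"{pages_per_second:.0f} pages/s")              # 819 pages/s
print(f"{projected_seconds:.1f} s for 20,000 pages")  # 24.4 s for 20,000 pages
```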


Interesting. Are you using Related [0] content? I always suspected the long build was due to related content but the build profiler isn't specific enough to point that out.

[0] https://gohugo.io/content-management/related/


I've faced this issue as well.

I run a Python script after the build has completed that reads the entire sitemap and chunks it into parts of 50k.
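
For anyone wanting to try the same approach, a rough sketch of such a post-build step (not the actual script mentioned here; `split_sitemap` is a made-up helper) using only the standard library:

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def split_sitemap(xml_text, chunk_size=50_000):
    """Split one <urlset> document into several, each holding at most
    chunk_size <url> entries. Returns a list of XML strings."""
    ET.register_namespace("", NS)  # serialize without an "ns0:" prefix
    root = ET.fromstring(xml_text)
    entries = root.findall(f"{{{NS}}}url")
    parts = []
    for start in range(0, len(entries), chunk_size):
        urlset = ET.Element(f"{{{NS}}}urlset")
        urlset.extend(entries[start:start + chunk_size])
        parts.append(ET.tostring(urlset, encoding="unicode"))
    return parts
```

Each returned string would still need to be written to its own file and listed in a sitemap index; this loads the whole sitemap into memory, which is fine at 50K entries but worth keeping in mind for much larger sites.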

+1 for mentioning incremental builds as well. These two features are often overlooked and people just ask "why do you even have so many pages?" and it ends there because nobody wants to entertain the power user.

A third disadvantage I've seen is search. I use fuse to generate a JSON for client side search but it takes a lot of memory to render and is a cause for concern.


What kind of sites are you building with static site generators that have 10K+ pages of content?

I always got the impression that they were more for small blogs/sites.


You can take your slow dynamic site and throw some caching in front of it or you can just skip all the moving parts and make the whole thing static from the start and let nginx do what it does best and serve static files super fast. I've done a lot of work for media and news sites that previously ran on WordPress and it is not uncommon to see 10K posts or more. I would argue static site generators are considerably underused for many types of sites that genuinely don't need an active datastore connection.


Why are you assuming a dynamic site is slow?

Everything after the first request is cached by a CDN anyway so the only difference is when that original page is generated. It only takes a few milliseconds to generate the page from a database vs reading a file from disk.

The bigger advantage to static sites is portability (even though you can have a webapp that reads markdown files just as easily as database rows).


This is actually very interesting. Can you tell me more about the recommended stack for that kind of website? You said media, but I guess directories too?


There are so many options but it genuinely depends on how your content is being generated. If you have everything in a giant CSV or JSON file then there's not much work to do at all. Many shops I worked with were used to WordPress and after WP released the API functionality it became possible to completely decouple the editing & publish function from the front end display. You can still use a securely locked-down WP install for content creation and editing, then just use the (also locked-down) WP API endpoint as your data source that gets fed into your site generator. Hugo has an awesome feature where if you feed it a URL it will go fetch your content for you, so it doesn't even need to exist in the /content dir prior to build. Thanks to the free tiers for Gitlab CI/CD or Github Actions, your repo only contains theme code and with a 10-line CI/CD config file and a container image that has Hugo installed, you just need a trigger to run the build pipeline - then you're left with the static site that you can publish anywhere you like.

If you don't need the user roles and editor experience you get from WP then you can feed your generator the content from any source you like as long as the files have the correct frontmatter to turn them into pages/posts. You might need a little glue script here and there to chain systems together, but it's not super difficult to adapt to whatever pipeline you need after you understand where your content comes from.
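
As a sketch of what such a glue script might look like: the function below turns one record into a Markdown file body with YAML front matter. The keys `title`, `date`, `slug` and `body` are hypothetical; map them from whatever your source (CSV row, JSON record, API response) actually provides, and check which front matter fields your generator expects.

```python
import json


def to_markdown(post):
    """Render one post dict as Markdown text with YAML front matter.

    The keys used here ("title", "date", "slug", "body") are assumptions
    made for this example, not a fixed schema.
    """
    front_matter = "\n".join(
        # json.dumps yields a double-quoted, escaped string, which is
        # also a valid YAML scalar, so quotes and colons in titles are safe.
        f"{key}: {json.dumps(post[key])}"
        for key in ("title", "date", "slug")
    )
    return f"---\n{front_matter}\n---\n\n{post['body']}\n"
```

Writing each result to something like `content/posts/<slug>.md` before the build is then a short loop over your records.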



This is neat! Good work. Was any of this inspired by Zola? https://www.getzola.org/documentation/getting-started/overvi...

I’ve used Zola and love the build speed and power that it has. My only real complaint is CommonMark as the Markdown engine. I prefer the Tera/Jinja2-esque template syntax over the Ramhorns/Mustache-esque syntax that’s used here.

I’ll have to try and port a theme to Blades to give it a fair assessment.


Certainly, I have known Zola for a while and it has a nice feature set.


It would be great if every new static site generator had a page that has comparisons with the ones that came before...at least for the ones in the same language and same deployment model (or dependencies). After going through this site, I was left wondering (as someone who hasn't used Rust based SSGs) what the differences are between Blades, Zola and Cobalt.

The site says that it's a hobby project, but additional marketing with comparisons is always useful to get more eyes (and possibly more contributors too).


Nice work, thank you for sharing!

Great job on the 386 theme.

Documentation is understandable and easy to dive into.

Great design, and rather complete feature-set.

TOML looks like a great format, too.


All I want is a static site generator that sits in an executable file in my project's directory and doesn't require any dependencies so that I can run `./ssg build` and have all of my Markdown files interpolated into my template files and put in their own directory. I don't want to have to install all of the dependencies for Ruby or Hugo or Blades, cool as this project looks. Are there any SSGs that fill that niche?

(A Python executable would be fine since pretty much every Linux computer has Python, but I don't want to have to deal with pip).


That's pretty much what hugo[1] is. Download the .tar.gz and put the executable wherever you want.

[1] https://github.com/gohugoio/hugo


Thanks. I had looked at Hugo before, but I guess I just didn't see the binary install option. (https://gohugo.io/getting-started/installing/#binary-cross-p...)


Do save yourself some head-scratching and time, and download the hugo_extended package.


Check out https://www.getzola.org. It's a single executable. It even does syntax highlighting (without using client-side javascript).

I moved to it from Jekyll/Hugo, because I hate dealing with Ruby and dependencies like you!


There seems to be apathy in the Rust community towards using client-side JS for syntax highlighting. The Rust documentation and The Rust Book use it for no good reason. Code in examples doesn't change, and the way they are highlighted doesn't change. In my opinion it clashes with the philosophy of the language - that as much as possible should be done before the code is run.

Aside from my pedantry, it causes computers to make more unnecessary computations and our civilization emits more carbon. It also affects me because I currently have an old CPU and it takes a few seconds for syntax highlighting to kick in. Every time I press "Back" in my browser, syntax highlighting needs to be recalculated as well.

Oh well, on the bright side I have an itch to scratch :-).


> Code in examples doesn't change, and the way they are highlighted doesn't change.

I'd consider it an accessibility issue - hardcoding the syntax highlighting on the backend makes it more difficult for people to change using accessibility tools. Whereas with js-based highlighting, you can probably just change the theme and get readable output.


What kind of accessibility tools are you thinking of that would work with JS highlighting, but not server-generated? (Given that generally both would produce roughly the same in-browser DOM, which is what most modern accessibility tools work against)


> that would work with JS highlighting, but not server-generated?

Mainly that JS highlighting normally works with a theme variable which is much easier to change to get, e.g., a high-contrast theme than having to muck about with 10-20 different CSS styles.

Ignoring that option, working with a known set of CSS styles in the DOM is going to be easier than dealing with John Q Random's own particular set for highlighting (assuming they haven't used the same as a popular JS highlighter, of course!)

(To my shame, whatever Hugo is now using for syntax highlighting makes an awful mess - everything is "style:color#123456", no semantic information - I'll be fixing that ASAP.)


You seem to be describing accessibility tools for completely blind users, and I think they were describing accessibility tools for users with other sorts of disabilities. For example, someone who uses a custom stylesheet or a browser extension to make the page have higher contrast.


No, the question applies to those just as much.


As far as I know, parsing the accessibility tree is primarily limited to screen readers / braille displays. Am I missing something?


I'm speaking of the normal DOM representation of the HTML in the page. A server-side highlighter generates HTML with CSS classes or styling attached that gets turned into a DOM by the browser. A JS-based highlighter edits the DOM by adding CSS classes or attaching styling. The in-browser representation of the end result, which a browser extension likely works with, can be the same, and what exactly it looks like matters more for their functionality than whether the server or JS produced it.


If you used semantic class names you could change the theme with just css.
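
To make that concrete, here is a toy build-time highlighter that emits semantic classes instead of inline colors. It is only a sketch: the tokenizer is deliberately naive (Python keywords, integers and simple double-quoted strings only), and the class names `keyword`, `number` and `string` are made up for the example rather than taken from any real highlighter.

```python
import html
import keyword
import re

# Order matters: strings first, then integers, then identifiers.
TOKEN = re.compile(
    r'(?P<string>"[^"\n]*")|(?P<number>\b\d+\b)|(?P<word>\b[A-Za-z_]\w*\b)'
)


def highlight(source):
    """Wrap recognized tokens in <span class="..."> at build time."""
    out = []
    pos = 0
    for m in TOKEN.finditer(source):
        out.append(html.escape(source[pos:m.start()]))  # text between tokens
        text = html.escape(m.group(0))
        kind = m.lastgroup
        if kind == "word":
            # Only keywords get a class; other identifiers stay plain.
            kind = "keyword" if keyword.iskeyword(m.group(0)) else None
        out.append(f'<span class="{kind}">{text}</span>' if kind else text)
        pos = m.end()
    out.append(html.escape(source[pos:]))
    return "<pre><code>" + "".join(out) + "</code></pre>"
```

Because the output carries only class names, a reader's custom stylesheet or extension can restyle it with a rule such as `.keyword { color: ... }`, with no JavaScript running in the browser.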


apathy - neutral / "don't care"

vs

antipathy - negative / "hatred"


While another word or phrase would be preferable there, the GP did in fact use the word correctly.

They are claiming that the rust team uses client side JS because they don't care about the negative implications of doing so.


Thanks for the suggestion!

Some of the other commenters have pointed out that Hugo also has standalone binaries, I just didn't notice them last time I looked. What made you choose Zola over Hugo?


I think the main reason was the built in syntax highlighting, and simplicity. I've decided to move away from themes, and just code simple CSS by hand - so Zola kind of fits that mindset.

Hugo definitely has more features. If you think you'll want to do some complex hierarchies in your blog/site, or want a lot of themes to pick from I'd go with Hugo.

Hopefully that helps!

Edit: I think Hugo used to require Pygments (a pip package), but it looks like that's no longer the case. If it did that probably nudged me towards Zola too.


I really enjoyed Zola as well. It's sufficiently simple that the learning curve is really gentle but has just enough features to cover all my needs for a static site.

Coupling it with Tailwind for styling meant that developing the theme was really smooth (slowest part of changing a template was waiting for the browser to refresh, i.e. < 1s).


Hugo or any Go program should fit that bill, since that is one of the major selling points of Go.

Hugo has a ton of prebuilt executables that you just have to extract, and it's a single file: https://github.com/gohugoio/hugo/releases


Thanks. I had looked at Hugo before, but somehow didn't see the precompiled binary as an installation option. I'll try it out.


You don't need to use their precompiled binary either. It's available in brew (OS X), some Windows package manager (don't remember the name), distro packages, etc.


Yeah, but package managers are exactly what I want to avoid here :)


I'm not sure why Ruby (dependencies at run time) and Go and Rust (dependencies at compile time) are both lumped together as examples of what you aren't looking for.

Don't Go and Rust give you a static executable?


It looks like Hugo does provide a static executable. I'm aware that Go is known for providing standalone binaries, but last time I checked Hugo's website I didn't notice the binary installation method [0], which is why I mentioned it as equivalent with Jekyll. I'll give it another look.

Regarding Ruby, Gems on Linux is a mess that I don't want to have to deal with for anything, but especially not for a static site generator.

Rust probably could provide a precompiled binary, but the recommended installation method for Blades involves `cargo` which is why I asked the question.

[0] https://gohugo.io/getting-started/installing/#binary-cross-p...


> Rust probably could provide a precompiled binary, but the recommended installation method for Blades involves `cargo` which is why I asked the question.

Cargo is the Rust compiler (or rather the Rust build system). After you run it, it'll spit out a binary which you can use elsewhere. If I'm reading your question correctly, this would be analogous to seeing a reference to CMake in a C++ project's README and then asking whether C++ supports precompiled binaries.


Both Go and Rust do not produce static binaries by default -- they both link to glibc and a few other libraries -- but you can coerce them both to do so (though with Rust it's slightly more painful because you need to install a compiler toolchain that uses musl, but rustup makes it pretty painless).


By static I meant go and rust dependencies bundled into the executable. Dynamically linking to libc is ok with me, any system I run on has a usable libc.


Though it should be noted that when you compile such a binary on a system with a newer glibc, it will not work on a system with an older glibc. To be fair this is basically the case with C as well (you can work around it with .symver but it's kind of dodgy), but it is worth keeping in mind.


Yes, this is true for a great many things. If I compile these pseudo-static binaries for use on multiple different linux distributions or libc versions, I compile on the oldest/lowest common denominator so that it runs everywhere I need it to.


If a dependency on Markdown/sh is acceptable, you might find this[0] to be helpful. I wrote it because I was in much the same boat.

[0] https://git.sr.ht/~evan-hoose/SSSSS

[1] http://a-shared-404.com/blog/stupid-simple-sites/


That looks a tad more basic than what I'd like, but I'll give it a try. Thanks for the links!


It is indeed quite basic. You should be able to replace 'markdown' with whatever renderer you want, so theoretically any sort of plaintext -> html tool should work. Alternately, if there's a more full featured markdown renderer you prefer that should work ootb.


They say "If you would like to have it included in your favourite package repository, submit an issue."


I don't know much about Rust. Can it be compiled into a standalone binary? C programs typically rely on header files and are not standalone, and Rust is (basically) C but with safeguards around memory, right?


Yes it can when you use musl instead of glibc


Thanks! Where can I read more about that? I looked at musl's website, but didn't see any mention of standalone binaries on the page comparing it to glibc. [0]

[0]: https://wiki.musl-libc.org/functional-differences-from-glibc...


This is more of a Rust quirk[1] than a glibc/musl thing. Glibc can be used statically (though it's not really recommended -- hence why Rust doesn't support it and requires you to use musl instead).

[1]: https://doc.rust-lang.org/edition-guide/rust-2018/platform-a...



You can do exactly this with Blades. The binary has no dependencies.


Where do you download the binary from?


You could download it from AUR: https://aur.archlinux.org/packages/blades-bin/

Or simply from GitHub releases: https://github.com/grego/blades/releases


what about this?: https://rgz.ee/bin/ssg5 modify it a little bit and run!


That still has an external dependency for Markdown, but yeah, that's basically exactly what I'm looking for. I might just stick Markdown.pl in the git directory with the shell file so that dependencies aren't a problem.

Thanks!

Documentation here: https://rgz.ee/ssg.html


if you like extreme minimalism : https://dataswamp.org/~solene/2019-08-26-minimal-markdown.ht...

unfortunately i don't see any license for that awk script, but still, just look at it!


This is a really weird requirement from my perspective...


Deal with a Jekyll site that sat unattended for two years and was built using the default Ubuntu ruby packages and you'll understand.


Yeah I don't use Jekyll (I use Hugo, although have been looking for a rust option that has a smaller binary), and I rebuild my static site often enough.

But a Python venv and a requirements lockfile can alleviate a lot of the issues you seemed to have.

Seems like placing the generator directly in the repo is a little bit of an overreaction. Maybe it is all you have time/patience for, but I would not consider it a clean practice. (E.g. it would break when you switch CPU architectures, which I have done since I started my site).


I hadn't thought of CPU architecture problems since all of my computers in the past decade or so have used the same architecture, but that's a good point.

I understand sticking the generator in the repo isn't a good practice, but I really just want a simple setup that won't have dependency problems. Probably an executable shell or Python file that doesn't draw in any dependencies would be ideal for me since those can run on pretty much any Linux computer in my experience.

Based off the recommendations here, I'll likely migrate my Jekyll site to Hugo or this one which was recommended by another commenter in the thread: https://rgz.ee/ssg.html

Right now, do you just keep Hugo in your `$PATH`? If you do that, do you need to note in your site's README which version of Hugo it should be built with, or are Hugo versions pretty backwards-compatible with each other?


Hugo is packaged in my distro and I just roll with the updates. Never had an issue before, but it has only been a year so maybe that is not representative of long term usage.


Indeed. Use a static generator properly packaged in a Linux distribution and you'll have reliability and security updates.

Use self-contained binaries from random sources and goodbye security.


Sounds very reasonable. Some Python/Ruby/JS static site generators come with dozens of dependencies...


Dependencies which can be tracked and locked!


Off topic:

Any aggregated lists of sufficiently mature projects in "new" languages (rust, nim etc)?

That would be helpful for people searching for some idiomatic code.

Also to evaluate performance and other characteristics of the language.


I wish for a static site generator with back links https://www.youtube.com/watch?v=9aM3JzBJ6qo


Another static-binary static-site generator I've recently discovered is soupault[1], which adds OCaml to the list of languages for that kind of thing, with extensibility via Lua. It takes a more unopinionated approach: instead of just doing Markdown expansion, it's also a generic HTML postprocessor.

[1]: https://soupault.neocities.org/


Very interesting project, thanks for sharing!

CommonMark supports HTML tags, so one can currently also write pages in plain HTML if they want to.


Blades author here. Just added it to AUR for more convenient install for people who are not Rust developers: https://aur.archlinux.org/packages/blades-bin/


Hey super nice! I would add some more info to the docs: formats supported (md, rst?, etc), if it is also a blog generator I would include that info. Looks pretty awesome and simple, thanks for sharing!


It only supports CommonMark markdown now. Maybe I should state it more explicitly somewhere.



I'd say just by using YAML this already loses out to the posted project :)


Why do you say that? I find TOML much harder to write compared to YAML.


Yeah, TOML is kind of more verbose. However, its syntax is uniform (`key = value`) and there is no way to shoot yourself in the foot.


Dang this is nice.

I just built two sites, one with Hugo and the other in Eleventy, but this is nice and I can see how useful it can be, i.e. I would love to use it for my next excuse for a website.

Thanks for sharing.


Thanks for the link, it's always good to know about similar alternatives.



