Paths, Paths, Paths ...
I started writing this post a few weeks ago, but it never felt finished. There is still so much more to cover. I might revise it when I find some more time, but I hope it is of some use already.
Classpath, :source-paths, :paths, :asset-path, …
There are a lot of different “paths” you’ll encounter when working on a Clojure(Script) project, and their meaning can be confusing. Beginners especially often seem to struggle with getting the project set up correctly and understanding what all these paths mean. I hope to clear up some of the confusion around the most common setups you’ll see when working with CLJS.
Classpath
This is the big one and probably the most confusing if you are not coming from a JVM background. This is the basis for all CLJ(S) projects and understanding how this works is crucial.
The Classpath is how content addressing works in the JVM: it tells the system how to find resources. A resource is just a file, but to make things a little more distinct I’ll be referring to “files on the classpath” as resources. Clojure and ClojureScript both use this mechanism when translating namespaces to resources and ultimately finding the actual files.
The Classpath is a “virtual filesystem” that combines multiple entries to make them look like one. Each entry is either a directory or a .jar file. A .jar file is basically just a .zip file, so imagine it as a few files packed into a zip file with their pathnames intact.
Clojure(Script) uses a simple namespacing mechanism which is mostly controlled via the ns form. Let us dissect a very simple form of the kind you’ll often write in CLJ(S) code.
(ns my.awesome.app
  (:require [clojure.string :as str]))
First we specified the namespace name as my.awesome.app and then we required the clojure.string namespace. To find the code behind such a name we need to translate it to a resource name, and the rules for this are very simple: replace . with / and then append .cljs, .cljc or .clj depending on what you are looking for. There is also a rule for replacing - with _, but that is about it.
So we first translate my.awesome.app to my/awesome/app.cljs, which is its resource name. clojure.string we translate to clojure/string.cljs.
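The translation rules above can be sketched as a small Clojure function. This is only an illustration: ns->resource-name is a hypothetical helper name I made up, not something Clojure or shadow-cljs provides, and the real compilers do a bit more (eg. trying several extensions).

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: turns a namespace symbol into a resource name.
(defn ns->resource-name
  [ns-sym ext]
  (-> (str ns-sym)
      (str/replace "-" "_")   ; dashes become underscores
      (str/replace "." "/")   ; dots become path separators
      (str "." ext)))         ; append the extension we are looking for

(ns->resource-name 'my.awesome.app "cljs")
;; => "my/awesome/app.cljs"
(ns->resource-name 'my.cool-lib.core "cljc")
;; => "my/cool_lib/core.cljc"
```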
The next step is translating this resource name to an actual filename on disk, which is where the classpath comes into play. The classpath is constructed when the process you are working with is started. Regardless of whether you use shadow-cljs.edn, deps.edn or project.clj, the tool first constructs the classpath based on your configuration.
In shadow-cljs.edn, :source-paths and :dependencies are the two options that control this. project.clj via lein has a few more such as :test-paths, :resource-paths, etc. deps.edn just has :paths and :extra-paths and uses :deps to configure dependencies.
So say we have configured this shadow-cljs.edn (the same applies to all the others, this is just a simple example):
{:source-paths
 ["src/dev"
  "src/main"
  "src/test"]
 :dependencies
 [[reagent "1.0.0"]]
 ...}
The shadow-cljs command line utility will first download all dependencies such as reagent, shadow-cljs, clojurescript, etc. and put them into the proper place in the ~/.m2 directory. This is a convention from the JVM Maven ecosystem, but you’ll likely never need to actually look at it. Each dependency is packaged as a .jar file, so reagent ends up at ~/.m2/repository/reagent/reagent/1.0.0/reagent-1.0.0.jar.
To construct the actual classpath each tool will then combine all the manual paths (eg. :source-paths) you configured with the dependency .jar files and construct a list of them:
src/dev
src/main
src/test
~/.m2/repository/reagent/reagent/1.0.0/reagent-1.0.0.jar
~/.m2/repository/thheller/shadow-cljs/2.12.1/shadow-cljs-2.12.1.jar
- …
~/.m2/repository/org/clojure/clojurescript/1.10.844/clojurescript-1.10.844.jar
- …
This list can often get very long, but it is managed for you by the tool and your config so you don’t really need to worry about the fine details.
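If you are curious what the list looks like for a running JVM, you can usually print it from a Clojure REPL. This is a small sketch; classpath-entries is a name I made up, and it assumes the tool passed the classpath via the standard java.class.path system property. Some tools may add entries through their own class loaders, so the output is not guaranteed to be complete.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: reads the classpath the JVM was started with
;; and splits it into its individual entries.
(defn classpath-entries
  []
  (str/split (System/getProperty "java.class.path")
             ;; the separator is ":" on Linux/macOS and ";" on Windows
             (re-pattern (java.util.regex.Pattern/quote
                          java.io.File/pathSeparator))))

;; Print one entry per line, in lookup order.
(doseq [entry (classpath-entries)]
  (println entry))
```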
When translating a resource name to an actual filename the JVM will just go over this list and stop when it finds a match. So when we want to find the clojure/string.cljs resource, the JVM will first check:
src/dev/clojure/string.cljs
src/main/clojure/string.cljs
src/test/clojure/string.cljs
None of these exist, so it just keeps going, one by one in order. When the classpath entry is a .jar file it’ll look into that file to see whether it contains the resource. Eventually it’ll arrive at clojurescript-1.10.844.jar and find the file it was looking for. I simplified a little here, since the CLJS compiler will actually look for two files: it’ll first try to find clojure/string.cljs and, if it doesn’t find that, it’ll look for clojure/string.cljc. Clojure will first look for .clj files and then for .cljc as well.
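The lookup described above can be sketched in plain Clojure. This is not how the JVM or the CLJS compiler actually implement it; entry-contains? and find-resource are hypothetical names, and the sketch only handles directories and .jar files.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])
(import '(java.util.zip ZipFile))

;; True if a single classpath entry (directory or .jar) holds the resource.
(defn entry-contains?
  [entry resource-name]
  (let [f (io/file entry)]
    (cond
      ;; directory entry: the resource is just a file below it
      (.isDirectory f)
      (.isFile (io/file f resource-name))

      ;; .jar entry: look inside the zip for an entry with that name
      (and (.isFile f) (str/ends-with? (.getName f) ".jar"))
      (with-open [zip (ZipFile. f)]
        (some? (.getEntry zip resource-name)))

      :else false)))

;; Tries each candidate name in order (eg. .cljs before .cljc) and walks
;; the classpath in order for each. Returns [entry resource] for the
;; first match, or nil if nothing matches.
(defn find-resource
  [classpath candidates]
  (first
   (for [resource candidates
         entry classpath
         :when (entry-contains? entry resource)]
     [entry resource])))

(find-resource ["src/dev" "src/main" "src/test"]
               ["clojure/string.cljs" "clojure/string.cljc"])
```

In a real project you would not write this yourself. On the CLJ side (clojure.java.io/resource "clojure/string.clj") asks the actual classpath the same question.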
Since it traverses the classpath in order it is important to choose a unique namespace prefix, as your code may otherwise collide with one of your dependencies. The common convention from the JVM world is to use reverse domain notation, so foo.company.com becomes the com/company/foo resource path and the com.company.foo namespace prefix.
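As a tiny illustration of the reverse domain convention, here is a sketch that flips a domain into a namespace prefix. domain->ns-prefix is a made-up helper for this post only.

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper: reverses the dot-separated parts of a domain.
(defn domain->ns-prefix
  [domain]
  (->> (str/split domain #"\.")
       reverse
       (str/join ".")))

(domain->ns-prefix "foo.company.com")
;; => "com.company.foo"
```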
Those rules then also tell you where to put your source code. The default convention would be to put my.awesome.app into src/main/my/awesome/app.cljs. We often specify multiple :source-paths to separate out development or test-only code from the actual sources of our application. This is not strictly necessary, and you could instead put it all into one source path, but it can make the project setup slightly cleaner.
The important bit is that the resource name must be found exactly on the classpath. A common mistake is setting the source path at the wrong level. Say you put the actual file at src/main/my/awesome/app.cljs but configure {:source-paths ["src"]}. Following the rules this will only end up looking for src/my/awesome/app.cljs and thus never find your actual file.
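For the layout in that example, a config along these lines should let the file be found. This is a sketch showing only the relevant key, the rest of the config is left out.

```clojure
;; shadow-cljs.edn (sketch): the file lives at src/main/my/awesome/app.cljs
{:source-paths ["src/main"]
 ;; my/awesome/app.cljs is now looked up as src/main/my/awesome/app.cljs
 ;; ["src"] would instead look for src/my/awesome/app.cljs
 }
```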
Because of how the JVM works the classpath can currently only be configured once on startup, so changing :dependencies or :source-paths will require a restart of the shadow-cljs, lein or clj process.
Output Paths and HTTP
The Classpath controls everything related to the “inputs” used for your programs: locating source files and other additional resources. In CLJ you can access it at runtime, but for CLJS it is only relevant during compilation, not at runtime.
A common issue many CLJS devs run into is how to access the files in an HTTP context. I’ll be using shadow-cljs with the built-in :dev-http as an example, but the same things really apply to all setups.
Browsers are really picky for file security reasons and generally refuse to run certain code if you just load it from disk directly. Therefore, you’ll need an HTTP server to actually make use of the files generated by a shadow-cljs build.
The most basic build config for a :browser build looks like this:
{...
 :dev-http
 {3000 "public"}
 :builds
 {:app
  {:target :browser
   :modules {:main {:init-fn my.awesome.app/init}}
   :output-dir "public/js"
   :asset-path "/js"}}}
The :output-dir and :asset-path values are actually the defaults, so you could even omit them. I added them for example purposes.
Dissecting this we have a couple of paths. We already covered the Classpath, so I omitted :source-paths and :dependencies. All of the paths left are related to the output.
The first relevant option is the :output-dir of "public/js". This tells shadow-cljs to put all files it generates into that directory. The :main key in :modules controls what the file is called, so this will generate public/js/main.js. Each additional configured module would just generate an additional file in the :output-dir. This is relevant if you want to do more advanced code-splitting setups.
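To make the module naming concrete, here is a sketch that derives the module filenames from a build config map. output-files is a hypothetical helper, not part of shadow-cljs, and it only covers the one-file-per-module rule described above; the real build may write further files into the :output-dir.

```clojure
;; Hypothetical helper: one .js file per configured module,
;; named after the module key and placed in :output-dir.
(defn output-files
  [{:keys [output-dir modules]}]
  (mapv #(str output-dir "/" (name %) ".js") (keys modules)))

(output-files {:output-dir "public/js"
               :modules {:main {:init-fn 'my.awesome.app/init}}})
;; => ["public/js/main.js"]
```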
The next option is :dev-http {3000 "public"}, which instructs shadow-cljs to start an HTTP server on port 3000 serving the public “root” directory. This will make all files in this directory available over http://localhost:3000. When you open that address the Browser will actually request http://localhost:3000/ since there must always be a path in the URL. The URL is constructed of several standardized pieces, starting with the scheme http: then the host localhost and the port 3000 and a path of /.
Since / is not a valid filename you could create in a directory, the convention is for the server to look for an index.html file instead when it receives a request ending with a /. Custom servers are in full control over this so this does not apply to all servers, but it does for :dev-http.
Assume that HTML file contains a <script src="/js/main.js">. Since that only specifies the path without a new scheme or host, the browser will perform the request reusing those parts from the initial request to http://localhost:3000/, making it http://localhost:3000/js/main.js. Since that is an actual filename the server will just look for it in its root directory and will end up giving you the content of public/js/main.js.
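The way the script path gets combined with the page URL can be tried out on the JVM with java.net.URI. Browsers follow their own URL rules, but for simple cases like this one the result should be the same. resolve-src is a made-up helper name.

```clojure
(import '(java.net URI))

;; Hypothetical helper: resolves a src attribute against the page URL,
;; reusing scheme, host and port when src only specifies a path.
(defn resolve-src
  [page-url src]
  (str (.resolve (URI. page-url) ^String src)))

(resolve-src "http://localhost:3000/" "/js/main.js")
;; => "http://localhost:3000/js/main.js"

;; A src with its own scheme and host is used as-is.
(resolve-src "http://localhost:3000/" "http://some.cdn/js/main.js")
;; => "http://some.cdn/js/main.js"
```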
Basically you cut out the scheme and host portion and prepend the :dev-http root to select the file you’ll actually get:
1. HTTP URL
http://localhost:3000/js/main.js
2. cut the scheme and host:port
[http://localhost:3000]/js/main.js
3. prepend the :dev-http root
public/js/main.js
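The three steps above, plus the index.html convention, can be sketched like this. url->file is a hypothetical helper and not what :dev-http actually runs; a real server also has to deal with things like missing files and unsafe paths, which this ignores.

```clojure
(require '[clojure.string :as str])
(import '(java.net URI))

;; Hypothetical helper: maps a request URL to a file below the server root.
(defn url->file
  [root url]
  (let [path (.getPath (URI. url))                 ; cut scheme and host:port
        path (if (str/blank? path) "/" path)       ; no path means "/"
        path (if (str/ends-with? path "/")
               (str path "index.html")             ; directory request
               path)]
    (str root path)))                              ; prepend the root

(url->file "public" "http://localhost:3000/js/main.js")
;; => "public/js/main.js"
(url->file "public" "http://localhost:3000/")
;; => "public/index.html"
```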
:dev-http actually allows specifying multiple roots as well as using the classpath. :dev-http {3000 ["foo" "bar" "classpath:public"]} would first look for foo/js/main.js, then try bar/js/main.js and then try to find the public/js/main.js resource on the actual classpath (including in the .jar files).
The :asset-path becomes important when the generated code needs to locate additional files to load at runtime. It should always be the bit of path that needs to be added in front of the generated module filename (eg. main.js). The final constructed path should be directly loadable in your browser, eg. http://localhost:<port>/<asset-path>/<module>.js.
At runtime on the client side the :asset-path is actually just treated as a prefix, so you can specify a full URL if you actually host the JS code on another server (eg. :asset-path "http://some.cdn/with/a-nested/path").
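Treating the :asset-path as a plain prefix can be pictured like this. module-url is a hypothetical helper for illustration, not the actual loader code shadow-cljs generates.

```clojure
;; Hypothetical helper: the asset path is simply put in front of the
;; module filename, whether it is a path or a full URL.
(defn module-url
  [asset-path module]
  (str asset-path "/" (name module) ".js"))

(module-url "/js" :main)
;; => "/js/main.js"
(module-url "http://some.cdn/with/a-nested/path" :main)
;; => "http://some.cdn/with/a-nested/path/main.js"
```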
Setting an incorrect :asset-path may appear to work, since it is only relevant when loading files dynamically at runtime. release builds may not actually be doing this, but watch builds often do, and an incorrect path may lead to “file not found” request errors (eg. for source maps).