This repository has been archived by the owner on Sep 2, 2023. It is now read-only.

Proposals for configuring parse goals of files in --experimental-modules #160

Closed
wants to merge 2 commits into from

Conversation

GeoffreyBooth
Member

@GeoffreyBooth GeoffreyBooth commented Jul 28, 2018

Following up from #150, this thread is for determining the user experience of an addition to package.json that configures Node to treat .js files as ESM within a package boundary (i.e. with the “module” parse goal). This assumes the --experimental-modules implementation as a baseline; other implementations might not have this issue, for example if they don’t allow importing ESM via import statements. For suggestions of other implementations, can I invite people to open their own threads for alternate proposals? And let’s keep this one focused on what ESM in .js in --experimental-modules should look like.

Readable view: https://github.com/GeoffreyBooth/modules/blob/esm-in-js-ux/proposals/esm-in-js-ux.md

To discuss particular proposals, please add comments as PR notes after those proposals. Add general comments to suggest new proposals or broader changes.


Update: we have reached consensus! Of the people who have commented on this thread, at least. Here’s the consensus proposal:

MIME types/webserver as metaphor, defined in package.json and/or in linked JSON files

In browsers, users must configure their webservers to serve .js or .mjs files with a JavaScript MIME type like text/javascript in order for the browser to recognize the file as ESM. The browsers pay no attention to file extensions, but most webservers use file extensions to determine what MIME types to use to serve files. This is similar to AddType video/webm .webm that you may have seen in Apache webserver configuration files.

The idea here is to mimic this webserver configuration in a new package.json section called mimes:

"mimes": {
  "js": "application/node",
  "mjs": "text/javascript",
  "json": "application/json"
}

In this example, .js files would be treated as CommonJS, which has the MIME type application/node. .mjs files would be treated as ESM (text/javascript, or application/javascript) and .json files would be treated as JSON. The configuration in this example would be the default, what Node uses if a mimes field was missing, as historically .js files were always treated as CommonJS.

Here’s another example:

"mimes": {
  "js": "text/javascript",
  "cjs": "application/node"
}

This tells Node to treat .js files within this package boundary as ESM, and a new .cjs file extension as CommonJS. The .cjs extension could be anything—just as in Apache a user could add the configuration AddType video/webm .foo, this mimes block provides the flexibility for new file extensions to be defined and mapped to MIME types that Node understands. Since .mjs isn’t listed here, Node falls back to the default mapping for it (text/javascript), and likewise for any other file extensions that Node recognizes by default (.json, etc.).
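As a rough illustration of the lookup described above, the sketch below resolves a file's format from its extension, preferring the package's mimes block and falling back to defaults. The names `DEFAULT_MIMES` and `formatForFile` are hypothetical, and the default table simply mirrors the first example; none of this is an actual Node API.

```javascript
const path = require('path');

// Hypothetical default table, mirroring the first example in this proposal.
const DEFAULT_MIMES = {
  js: 'application/node',
  mjs: 'text/javascript',
  json: 'application/json',
};

// Returns the MIME type for a file, or undefined if the extension is unmapped.
// Package-level entries win; anything not listed falls back to the defaults.
function formatForFile(filename, packageMimes = {}) {
  const ext = path.extname(filename).slice(1); // '.js' -> 'js'
  const merged = Object.assign(Object.create(null), DEFAULT_MIMES, packageMimes);
  return merged[ext];
}

const mimes = { js: 'text/javascript', cjs: 'application/node' };
console.log(formatForFile('lib/index.js', mimes));   // text/javascript
console.log(formatForFile('lib/legacy.cjs', mimes)); // application/node
console.log(formatForFile('lib/other.mjs', mimes));  // text/javascript
```

The merge uses a prototype-less object so that an extension such as `constructor` cannot accidentally pick up an inherited property.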

That mimes section could alternatively take an array of strings, which would be relative or resolved paths to JSON files:

"mimes": [
  "./my-mimes.json",
  "typescript/mimes.json"
]

Each JSON file would be an object with “extension: mime” mappings like the first example. They would be combined using Object.assign. If the first element in the array is null, Node’s default MIME mappings are discarded first before the array elements are merged, e.g. something like:

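const nodeDefaultMimeMappings = require('module').mimeMappings; // hypothetical API

function resolveMimeMappings(packageJsonMimes) {
  // Copy the defaults so that merging never mutates Node's own table.
  const defaults = Object.assign(Object.create(null), nodeDefaultMimeMappings);
  if (!Array.isArray(packageJsonMimes)) {
    return Object.assign(defaults, packageJsonMimes);
  }
  // A leading null discards Node's defaults before anything is merged.
  const mimeMappings = (packageJsonMimes[0] === null) ? Object.create(null) : defaults;
  packageJsonMimes.forEach((filePath, index) => {
    if (filePath === null) {
      if (index !== 0) {
        throw new Error('Only the first element of a mimes array may be null');
      }
      return; // Nothing to load for the leading null.
    }
    Object.assign(mimeMappings, require(filePath));
  });
  return mimeMappings;
}

const mimeMappings = resolveMimeMappings(require('./package.json').mimes);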

Through this mimes section that can take either an object or an array of strings that reference JSON files (of mimes mapping objects), users can configure Node’s handling of file extensions.

@michael-ciniawsky

michael-ciniawsky commented Aug 1, 2018

**across systems**

:) Since node doesn't support import "https://..." (HTTP) atm and a particular node application normally runs on one and the same system, using the file system instead of HTTP transfer, mapping the extension to (or sniffing/parsing) the MIME is just an extra layer in most cases. Hence I don't (yet) see the benefits for node in having a cross-system compatible format for its module system in general (but that's a separate discussion).

pkg.mime

By "in this context" I'm especially referring to (npm) packages, which are consumed neither directly via STDIN nor as a data: URL (those use cases could use MIME parsing, maybe even passing a Data URL to STDIN node --eval 'data:<type>[;<encoding>],import x from 'pkg' potentially solves some STDIN related issues 🤔, but that's a separate issue/feature/use case as well imho), so why would using MIME be a preferable approach for packages compared to e.g. index.js/index.mjs, pkg.main/pkg.module or pkg.mode (particularly in terms of setting the Parse Goal (CJS/ESM))?
e.g.

package.json

"mimes": {
   "text/javascript": [ ".mjs" , ".js" , ".coffee", ... ],  // Means ESM (?)
   "application/node": [ ".js", ".coffee" ]                 // Means CJS (?) 
},

What happens here in case someone specifies .js, .coffee for 2+ MIME Type mappings?

package.json

"mimes": {
   ".mjs": "application/node", // (CJS ?)
   ".wasm": "text/javascript"  
}

What happens if someone specifies 'incorrect/confusing' mappings ?

Loaders

In addition we have an ordering problem, in your example you have paired format declaration and handling in the same chain of delegation. These could have separate orderings and are not tightly coupled in nature.

Please clarify. I'm sorry, but especially "chain of delegation" sounds really vague to someone missing the context on potential implementations for loaders.

This has a problem for tools that do not ship with a JS VM. Even if loaders require a JS VM to handle resources, declaration of resource formats are able to be defined in a way that they can be statically detected. Routing to an appropriate pipeline for a given format without requiring a JS VM would be ideal.

Same here: too vague and abstract for the average human being 🙃. I don't understand why/how a loader for node should work without a JS VM, and what other 'appropriate pipelines' you have in mind here (in particular) other than writing a node loader?

This just looks like an encoding similar to a MIME, however it doesn't work with things like data: URLs or HTTP servers.

Sure, but data: URL and HTTP handling are not needed here?

@devsnek
Member

devsnek commented Aug 2, 2018

the thing here seems to be people can't agree on how node will key module types.

i would argue that we shouldn't be focusing on where the map of types goes (a resolve hook could look in a package.json, we don't need to implement that behaviour) but rather what system we want to use to key the types.

personally i am in favour of mimes. if node itself is as verbose as possible, userland hooks can narrow down, but you can't go the other direction.

@bmeck
Member

bmeck commented Aug 2, 2018

so why would using MIME be a preferable approach for packages compared to e.g index.js/index.mjs, pkg.main/pkg.module or pkg.mode (particularly in terms of setting the Parse Goal (CJS/ESM)?

education and consistency.

these have been discussed in many issues and I'm not really wanting to review every API design you post in these threads but can if given enough time.

What happens here in case someone specifies .js, .coffee for 2+ MIME Type mappings?

We should not use the configuration encoding you have here. In particular it needs to be 1 file extension to 1 format. The usage of arrays in your JSON simply leads to conflicts. Just an API design problem where you should have used a 1-1 dictionary. Since multiple file extensions can map to the same MIME, it needs to be keyed off the file extension.
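To make the conflict concrete, here is a small sketch that converts the array-based encoding into a one-to-one dictionary keyed by file extension and reports every extension claimed by more than one MIME type. The function name `invertMimeArrays` is made up for this illustration.

```javascript
// Converts { mime: [extensions] } into { extension: mime }, collecting
// every extension that was claimed by more than one MIME type.
function invertMimeArrays(byMime) {
  const byExtension = Object.create(null);
  const conflicts = new Set();
  for (const [mime, extensions] of Object.entries(byMime)) {
    for (const ext of extensions) {
      if (ext in byExtension && byExtension[ext] !== mime) {
        conflicts.add(ext);
      }
      byExtension[ext] = mime; // last writer wins, which is the ambiguity
    }
  }
  return { byExtension, conflicts: [...conflicts] };
}

const result = invertMimeArrays({
  'text/javascript': ['.mjs', '.js', '.coffee'],
  'application/node': ['.js', '.coffee'],
});
console.log(result.conflicts); // [ '.js', '.coffee' ]
```

With a dictionary keyed by extension, this class of conflict cannot be expressed in the first place, since a JSON object holds one value per key.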

What happens if someone specifies 'incorrect/confusing' mappings ?

I'm not sure what this means.

  • Invalid MIME string as in it doesn't parse (e.g. js)?

I presume it would fail once if it gets routed to format disambiguation in the default loader. Note that userland loaders can supersede their parents, and may still allow those files to work either by not delegating to the default loader or by ignoring any errors.

  • Incorrect by typo etc. (e.g. text/js)

The default loader would return text/js for the format since it parses as a valid MIME. Once it is done going through loaders, Node wouldn't know how to make a Module Record for that format since it is not well known and it would fail there.

For any given format mechanism I would expect similar cases for unparsable values and values that are not known how to turn into Module Records.
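The two failure cases above can be sketched as a small classifier. This is only an illustration: `KNOWN_FORMATS`, `MIME_ESSENCE`, and `classifyFormat` are hypothetical names, the known-format list is taken from the examples in this thread, and the regular expression is a loose approximation of MIME syntax rather than a full parser.

```javascript
// Formats this sketch pretends the runtime can turn into Module Records.
const KNOWN_FORMATS = new Set([
  'application/node',
  'text/javascript',
  'application/json',
]);

// Loose "type/subtype" check; real MIME parsing is stricter about tokens.
const MIME_ESSENCE = /^[a-z0-9!#$&^_.+-]+\/[a-z0-9!#$&^_.+-]+$/i;

function classifyFormat(value) {
  // Drop any parameters and normalize case before checking.
  const essence = String(value).split(';')[0].trim().toLowerCase();
  if (!MIME_ESSENCE.test(essence)) return 'unparsable';
  return KNOWN_FORMATS.has(essence) ? 'known' : 'unknown-format';
}

console.log(classifyFormat('js'));              // unparsable
console.log(classifyFormat('text/js'));         // unknown-format
console.log(classifyFormat('text/javascript')); // known
```

The distinction matters for error reporting: an unparsable value can be rejected when the configuration is read, while an unknown but well-formed format can only fail once no loader has produced something the runtime understands.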

Please clarify. I'm sorry, but especially "chain of delegation" sounds really vague to someone missing the context on potential implementations for loaders.

Loaders have a well defined order in which they can call each other in various designs. Consider a more concrete case where we have 3 userland loaders:

node --loader jsx --loader apm --loader code-coverage test.js

code-coverage mutates the resolution first, then it could delegate up to jsx, then jsx could delegate to apm, which could then delegate to node (the default loader).
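A minimal sketch of that chain of delegation, assuming each loader is just a function that may call its parent. The wiring order is copied from the description above and is not derived from how Node actually orders `--loader` flags; `makeLoader` and the returned `{ url }` shape are hypothetical.

```javascript
// Builds a loader that records its name and then delegates to its parent.
function makeLoader(name, parent, trace) {
  return (specifier) => {
    trace.push(name);
    return parent(specifier);
  };
}

const trace = [];
// The default loader ends the chain.
const node = (specifier) => {
  trace.push('node');
  return { url: specifier };
};
// Wire the chain in the order described above.
const apm = makeLoader('apm', node, trace);
const jsx = makeLoader('jsx', apm, trace);
const codeCoverage = makeLoader('code-coverage', jsx, trace);

codeCoverage('./test.js');
console.log(trace.join(' -> ')); // code-coverage -> jsx -> apm -> node
```

Any loader in the chain could also decline to call its parent, which is what makes it hard to say whose format declaration should win.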

By tightly coupling formats to resolution semantics we get into an odd situation: which of these loaders formats is the proper one for node to return from resolve(). In this situation we can somewhat safely say .js come from the jsx loader and might even be a JSX format. However, what happens if our APM supports JSX:

node --loader apm --loader jsx --loader code-coverage test.jsx

How does apm avoid specifying that .js files are JSX but still support them?

This is increasingly complicated if people mix usage of NODE_OPTIONS and CLI flags since NODE_OPTIONS always occurs ahead of CLI flags in our loading process.

Essentially, I'm saying we shouldn't try to tie these configuration mechanisms to the same composition as multiple loaders.

Same here: too vague and abstract for the average human being 🙃. I don't understand why/how a loader for node should work without a JS VM, and what other 'appropriate pipelines' you have in mind here (in particular) other than writing a node loader?

A runtime loader doesn't need to work without a JS VM, but other preprocessing tools are in our ecosystem as well. It would be ideal if we could support them as long as we don't burden our own implementations. I don't think Node should become a transpiler or anything, but we shouldn't make things more difficult for them if avoidable.

Imagine a costly transform like webpack, prepack, or uglify. Being able to know the format of a given file in the same way as other tools / node would allow unification rather than repeated and different ways of specifying things like sourceType.

Sure, but data: URL and HTTP handling are not needed here?

Consistency matters for both education and allowing systems to avoid creating conflict with others. If we have multiple ways of defining the format of a given resource, that means that we need to keep those multiple format disambiguation methods in sync and keep parity in features between them. There is no clear reason to encourage having multiple different ways to describe the format of a resource, but there are reasons to encourage having a single way to describe the format.

@michael-ciniawsky

michael-ciniawsky commented Aug 3, 2018

education and consistency.

Some kind of education for education's sake. I'm arguing that what to educate is fairly important here, and I simply disagree with the currently proposed contents of this proposal, since there are imho 'better' (simpler, more explicit) ways available.

Consistency (see below *)

these have been discussed in many issues and I'm not really wanting to review every API design you post in these threads but can if given enough time.

Sorry, but nobody forces you to do so, as nobody forces me or anyone else to review your proposals... I don't see the argument here, you are the one who needs to take responsibility for your actions

Invalid MIME string as in it doesn't parse (e.g. js)?

*Confusing, as it is theoretically possible to override any extname with a custom MIME Type via package.json, which is not needed since the main issue is still how to determine the Parse Goal for .js files, and mainly only for them (atm) (e.g. what would be other cases?). MIME also doesn't directly provide information about a parse goal for a certain Content-type, so assuming/using application/node (CJS) and text/javascript (ESM), especially the latter, is not correct (text/javascript;goal=module would be, but that doesn't exist (yet))

<!-- main.js { 'Content-type': 'text/javascript' } -->
<script src="/main.js"></script>
<!-- 
  module.js { 'Content-type': 'text/javascript' } 
  + the module attribute is mandatory to determine the parse goal, 
  MIME isn't involved in determining the parse goal here
-->
<script src="/module.js" type="module"></script>

{
   name: 'pkg',
   main: 'src/',
   mime: {
      // As I currently understand it, this assumes the MIME Type 'text/javascript' === isModule,
      // which is not defined in the MIME Standard and therefore incorrect and 
      // would be a node specific interpretation for 'text/javascript' 
      // being inconsistent with other hosts like e.g. the browser
      '.mjs': 'text/javascript',
      // This would be correct, but doesn't exist
      '.mjs': 'text/javascript;goal=module',
      '.js': 'application/node'
   }
}

{
   name: 'pkg',
   main: 'dist/', // dist/index.ext => { type: {MIME}, module: false }
   module: 'src/', // src/index.ext => { type: {MIME}, module: true }
}

Plus pkg.mime has the same limitations as pkg.mode, adding a 'package boundary' which makes it impossible to provide a 'dual-mode' package without having to rely on different file extensions 🤷‍♂️

By tightly coupling formats to resolution semantics we get into an odd situation: which of these loaders formats is the proper one for node to return from resolve(). In this situation we can somewhat safely say .js come from the jsx loader and might even be a JSX format. However, what happens if our APM supports JSX:

The ordering problem is a common problem for any plugin/loader system. At some point one simply has to know when a particular plugin/loader needs to be executed [in relation to other plugins/loaders]. This can only be mitigated, and because of that the fundamental question is always whether introducing such a system solves more issues than its intrinsic complexity adds

// Nesting within @import already doesn't work
// and the complexity of transforming CSS in a certain way  
// is magnitudes lower compared to what 'certain' could mean for JS
postcss([ nested, imports ])

A runtime loader doesn't need to work without a JS VM, but other preprocessing tools are in our ecosystem as well. It would be ideal if we could support them as long as we don't burden our own implementations. I don't think Node should become a transpiler or anything, but we shouldn't make things more difficult for them if avoidable.

In the context of node loaders, this would still require a node loader which e.g. spawns a child_process to execute and handle a non-JS tool etc.? Integration needs to happen somewhere, otherwise it is an entirely separate process

Consistency matters for both education and allowing systems to avoid creating conflict with others. If we have multiple ways of defining the format of a given resource, that means that we need to keep those multiple format disambiguation methods in sync and keep parity in features between them. There is no clear reason to encourage having multiple different ways to describe the format of a resource, but there are reasons to encourage having a single way to describe the format.

Yes, but in the context of that example, there would nevertheless be no need to educate anyone about data: URLs and HTTP handling, because it is not relevant there

@ljharb
Member

ljharb commented Aug 3, 2018

You need to know the parse goal for wasm too, and for any future module type. It’s not just JS.

@michael-ciniawsky

hmmm, but for WASM it would definitely not work then, since there is only application/wasm (to my knowledge) but WASM Modules aren't spec'd yet. It's either a good argument to make changes to the MIME spec to add something in the direction of goal=module and/to give it some traction or otherwise, not to rely on MIME for parse goals at all given the current status quo

@ljharb
Member

ljharb commented Aug 3, 2018

Or we could rely directly on extension, since .wasm is pretty clear-cut.

@bmeck
Member

bmeck commented Aug 7, 2018

@michael-ciniawsky

Some kind of education for education's sake. I'm arguing that what to educate is fairly important here, and I simply disagree with the currently proposed contents of this proposal, since there are imho 'better' (simpler, more explicit) ways available.

I don't see them being explained? Which 'better' ways are you talking about, and how do they fix the problems I've listed in the thread above. In particular, how are they going to keep parity when all other designs are being done using MIMEs and APIs are even using MIMEs (e.g. data: and Blob usage).

*Confusing, as it is theoretically possible to override any extname with a custom MIME Type via package.json, which is not needed since the main issue is still how to determine the Parse Goal for .js files, and mainly only for them (atm) (e.g. what would be other cases?). MIME also doesn't directly provide information about a parse goal for a certain Content-type, so assuming/using application/node (CJS) and text/javascript (ESM), especially the latter, is not correct (text/javascript;goal=module would be, but that doesn't exist (yet))

I think you are getting a bit confused, the Script goal of JS is never checked for. You can serve <script type=text/javascript src=index.html> and it simply ignores the MIME. MIME is not used in classic script loading per the WHATWG spec except to use the charset parameter that exists in all MIMEs. Currently the JS MIME only ever refers to the Module goal of JS in terms of standards for web browsers.

Plus pkg.mime has the same limitations as pkg.mode, adding a 'package boundary' which makes it impossible to provide a 'dual-mode' package without having to rely on different file extensions 🤷‍♂️

This is a limitation that seems fine to me as most people seem to be able to work only in one format or are able to use multiple file extensions? Use a per package "loader": for more complex behaviors and routing of files?

The ordering problem is a common problem for any plugin/loader system. At some point one simply has to know when a particular plugin/loader needs to be executed [in relation to other plugins/loaders]. This can only be mitigated, and because of that the fundamental question is always whether introducing such a system solves more issues than its intrinsic complexity adds

I don't understand if this is for or against keeping the configuration points decoupled. It seems like it is neither for nor against keeping things coupled, so I'm just going to state that it seems simple to keep them decoupled, and that doing so means you don't have to execute JS to determine such things.

In the context of node loaders, this would still require a node loader which e.g. spawns a child_process to execute and handle a non-JS tool etc.? Integration needs to happen somewhere, otherwise it is an entirely separate process

Yes, but this is not necessarily being done when the application runs. Loaders can be run ahead of time such as if someone wants to do things like a transpiler loader or resolve all URLs to absolute URLs ahead of running an application. The claim here is that there isn't really any advantage to

hmmm, but for WASM it would definitely not work then, since there is only application/wasm (to my knowledge) but WASM Modules aren't spec'd yet. It's either a good argument to make changes to the MIME spec to add something in the direction of goal=module and/to give it some traction or otherwise, not to rely on MIME for parse goals at all given the current status quo

WASM modules are first class already in terms of language design. There doesn't need to be any special MIME for them so I'm a bit confused at this comment. All that is being discussed is how to properly make facades for them to integrate with ESM since the current Abstract Module Record and Source Text Module Record constraints do not satisfy the needs of WASM integration.

@bmeck
Member

bmeck commented Aug 10, 2018

Just to clarify to everyone. It was always my intention in my gist mentioned for mappings that such databases not only be built into Node but also to be configurable by users and tools. I'd be fine if there was both the [...databasePaths] that looked up a resource and allowed composition/delegation and an alternative that allowed an in-band {"extension": "format"} form. Is there anyone thinking that we should limit ourselves to just one of these 2? The [] form from my gist that looks up paths to databases/includes a few reserved ones is my preferred one and others have a preference for the {} form, but both should be able to be statically analyzed by tools or by human lookup.

Could we amend the proposal to have both forms of this? I don't think they conflict with each other.

We should probably document removing existing mappings from being delegated to like {".node": null} or w/e for the {} form as well to help keep some parity if that sounds ok as well.
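One way the removal semantics for the {} form might look, as a sketch only: a null value deletes an inherited mapping instead of assigning it. `applyMappings` is a hypothetical helper, and the MIME string used for `.node` below is a placeholder invented for the example, not a registered type.

```javascript
// Applies one mapping object on top of another. A null value deletes the
// extension instead of mapping it, as in {".node": null}.
function applyMappings(base, overrides) {
  const result = Object.assign(Object.create(null), base);
  for (const [ext, mime] of Object.entries(overrides)) {
    if (mime === null) {
      delete result[ext];
    } else {
      result[ext] = mime;
    }
  }
  return result;
}

// Placeholder defaults for illustration only.
const defaults = {
  '.js': 'application/node',
  '.node': 'application/x-hypothetical-addon',
};
const effective = applyMappings(defaults, { '.node': null, '.js': 'text/javascript' });
console.log(Object.keys(effective)); // [ '.js' ]
console.log(effective['.js']);       // text/javascript
```

Plain `Object.assign` would not be enough here, because it would store the null rather than remove the key, leaving it to every consumer to treat null as "unmapped".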

@GeoffreyBooth
Member Author

GeoffreyBooth commented Aug 10, 2018

I’m trying to clarify @bmeck’s proposal and how it allows user-specified mappings. How about something like this:

Assume a new mimes section in package.json that can be an object, e.g.:

"mimes": {
  "js": "text/javascript",
  "cjs": "application/node"
}

That mimes section could alternatively take an array of strings, which would be relative paths to JSON files:

"mimes": [
  "./jsx.json",
  "./typescript.json"
]

Each JSON file would be an object with “extension: mime” mappings like the first example. They would be combined using Object.assign, e.g. something like:

const fs = require('fs');

function readMimes() {
  const packageJsonMimes = JSON.parse(fs.readFileSync('./package.json', 'utf8')).mimes;
  let mimes = {};
  if (Array.isArray(packageJsonMimes)) {
    // Later files override earlier ones, as with Object.assign.
    packageJsonMimes.forEach((filePath) => {
      const mimeMappings = JSON.parse(fs.readFileSync(filePath, 'utf8'));
      Object.assign(mimes, mimeMappings);
    });
  } else {
    mimes = packageJsonMimes;
  }
  return mimes;
}

And that’s it. The idea is that not just Node but also build tools need to be able to parse this mimes section, and so it can’t be terribly complicated. We can’t have "mimes": ["typescript/mimes.json"], I don’t think, because then a build tool would need to be able to implement Node’s package name resolution algorithm in order to find that mimes.json file. That wouldn’t be difficult for Node to do, of course, but it would be a challenge for a build tool—especially one not written in JavaScript—to achieve. I suppose the same could more or less be achieved by "mimes": ["./node_modules/typescript/mimes.json"]; the configuration would be pulled in from outside the package boundary, though the effects would apply only within the importing package.

@bmeck is this a fair characterization of what your proposal would look like in practice? Everyone else, is the utility of defining these mappings in external files worth the additional complexity?

@ljharb
Member

ljharb commented Aug 10, 2018

@GeoffreyBooth it wouldn't be challenging at all; every build tool already uses something like https://npmjs.com/resolve to do just that. More to the point, if it allowed any file path whatsoever, it'd have to be a require path imo, otherwise it would make no sense.

@bmeck
Member

bmeck commented Aug 10, 2018

@GeoffreyBooth

@bmeck is this a fair characterization of what your proposal would look like in practice? Everyone else, is the utility of defining these mappings in external files worth the additional complexity?

It also included the idea of removing mappings and a null database to ensure no default mappings are applied, so some things would change slightly, to be more composed like:

const fs = require('fs');

function readMimes() {
  const packageJsonMimes = JSON.parse(fs.readFileSync('./package.json', 'utf8')).mimes;
  let mimes = Object.create(null);
  if (Array.isArray(packageJsonMimes)) {
    for (const filePath of packageJsonMimes) {
      // null marks the null database; stop reading entries here.
      if (filePath === null) break;
      const mimeMappings = JSON.parse(fs.readFileSync(filePath, 'utf8'));
      Object.assign(mimes, mimeMappings);
    }
  } else {
    mimes = packageJsonMimes;
  }
  return mimes;
}

We can’t have "mimes": ["typescript/mimes.json"], I don’t think, because then a build tool would need to be able to implement Node’s package name resolution algorithm in order to find that mimes.json file.

I'm not sure how valuable delegation is if it cannot be in a shared package. "./node_modules/typescript/mimes.json" etc. are fragile and lead to other systems like meteor rewriting that kind of import/require when they encounter it. Since typescript can be above the current package on disk even if it is a dependency.

A lot of places already have the node resolution algorithm in place: Java for Closure Compiler, OCaml for Flow, JS via the lib @ljharb mentioned, etc. all have libraries, and if package name maps were introduced those tools would need to implement that resolution algorithm as well.

If it only is for relative paths I don't think it is valuable enough to warrant being shipped since it would lead to people using those fragile paths instead of encouraging tools to use libraries that implement the node resolution algorithm that already exist for various languages.

@GeoffreyBooth
Member Author

Since typescript can be above the current package on disk even if it is a dependency.

This presents a problem though. If typescript is some globally installed package, then the behavior of the current package varies based on the environment it’s in. Like if I have TypeScript 2.0.0 installed globally and it has some MIME mappings defined, but then my colleague has TypeScript 3.0.0 installed globally and 3.0.0 ships with different MIME mappings, then the package with "mimes": ["typescript/mimes.json"] in it behaves differently on my machine from my colleague’s.

@bmeck
Member

bmeck commented Aug 10, 2018

If typescript is some globally installed package, then the behavior of the current package varies based on the environment it’s in.

Globally installed packages are not part of resolution.

@GeoffreyBooth
Member Author

Globally installed packages are not part of resolution.

Then what do you mean by “typescript can be above the current package”?

@ljharb
Member

ljharb commented Aug 10, 2018

@GeoffreyBooth something like a node_modules like this:

foo
  |----- typescript
  |----- bar
         |--- typescript
         |--- baz

where both foo and baz do require('typescript')
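A simplified sketch of why both lookups succeed: resolution of a bare name walks up the directory tree, trying a node_modules folder at each level. The helper `nodeModulesPaths` and the `/foo/...` paths are assumptions made for this example; the sketch ignores NODE_PATH and other details of Node's real algorithm, and uses `path.posix` so the output does not depend on the platform.

```javascript
const path = require('path').posix;

// Lists the node_modules directories that would be searched, nearest first,
// when code in `fromDir` requires a bare package name.
function nodeModulesPaths(fromDir) {
  const paths = [];
  let dir = path.resolve(fromDir);
  while (true) {
    // Skip directories that are themselves named node_modules.
    if (path.basename(dir) !== 'node_modules') {
      paths.push(path.join(dir, 'node_modules'));
    }
    const parent = path.dirname(dir);
    if (parent === dir) break; // reached the filesystem root
    dir = parent;
  }
  return paths;
}

// baz is nested under bar, which is nested under foo.
console.log(nodeModulesPaths('/foo/node_modules/bar/node_modules/baz'));
```

For baz, the first candidate is its own node_modules folder and the second is bar's, which is where the nested typescript copy lives; foo finds its own copy one level higher.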

@GeoffreyBooth
Member Author

Okay, if you’re telling me that there won’t be resolution issues, like for something like the NPM website trying to read the package to list stats about it on its webpage, then sure.

@bmeck what’s the use case for null? Why would we want to start from a blank slate rather than Node’s defaults?

@bmeck
Member

bmeck commented Aug 10, 2018

@GeoffreyBooth main use case is disabling formats that node would have by default. JSON, C++, and CJS are not importable on the web natively, so setting them to be errors/not found would ensure they are not used in a package.


### [nodejs/node/pull/18392](https://github.com/nodejs/node/pull/18392)

This pull request adds support for a `"mode": "esm"` flag to be added to `package.json`, that tells Node to treat all `.js` files within the package boundary of that `package.json` file as ESM. The package boundary of a `package.json` file is deemed to be all modules in that folder and subfolders until the next nested `package.json` file.
Contributor

@jkrems jkrems Nov 6, 2018

Does this imply that every import './x/y/z.js' would necessarily trigger a search for package.json files along the path hierarchy?

Member

I'd expect so, that's how the current require implementation, and the ecosystem, works.
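The search being discussed could look roughly like the sketch below: starting from the imported file's directory, walk upward until a package.json is found. `findPackageBoundary` is a hypothetical helper, not part of Node, and a real implementation would presumably cache results per directory to avoid repeated filesystem checks.

```javascript
const fs = require('fs');
const path = require('path');

// Walks up from a file's directory and returns the nearest directory that
// contains a package.json, or null if none is found.
function findPackageBoundary(filePath) {
  let dir = path.dirname(path.resolve(filePath));
  while (true) {
    if (fs.existsSync(path.join(dir, 'package.json'))) return dir;
    const parent = path.dirname(dir);
    if (parent === dir) return null; // reached the filesystem root
    dir = parent;
  }
}

// Prints the package boundary for a file in the current directory, if any.
console.log(findPackageBoundary(path.join(process.cwd(), 'x', 'y', 'z.js')));
```

A nested package.json closer to the file wins, which matches the package boundary definition quoted above.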

@MylesBorins
Contributor

Is this something we want to continue to try and land or should we close it?

@GeoffreyBooth
Member Author

I think let’s please keep it open for now? We’re referring to the "mimes" proposal in the Stage 2 proposals, and it could very well end up getting implemented in the new modules implementation too. This may have been written in the context of --experimental-modules, but configurability of Node’s treatment of file extensions is potentially useful in any implementation.

@Alhadis

Alhadis commented Dec 17, 2018

Just drifting through to applaud those who worked on this solution. This is easily the most well-rounded and elegant solution I've read to date. Wouldn't have thought to define MIME types in package.json — it's clean, it works, and best of all, it breaks nothing. 👍

Just one question: will it be possible to declare an optional character encoding in the same field? E.g.:

"mimes": {
	"js": "application/node;charset= Shift_JIS"
}

If not, then perhaps it'd be more "correct" to use "mimeTypes" instead of "mimes", as it eliminates any potential for ambiguity: MIME itself is a broad term which covers quite a few things, only one of which is content/media-type.

Otherwise, great work!

@bmeck
Member

bmeck commented Dec 17, 2018

@Alhadis MIME includes parameters since its first RFC. However, note that compliant usage of MIME requires that unrecognized parameter names be ignored. For charset it is likely we will parse it, but if we comply with browsers for ESM it must be UTF-8 per https://html.spec.whatwg.org/multipage/webappapis.html#fetch-a-single-module-script which ignores the charset and decodes as UTF-8 always (and the same for all other script types).
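For illustration, a simplified parser that separates the MIME essence from its parameters, so that a consumer can read the ones it understands (such as charset) and ignore the rest. `parseMime` is a hypothetical helper written for this example; it does not handle quoted parameter values and is not a complete MIME parser.

```javascript
// Splits "type/subtype;name=value" into an essence and a parameter map.
// Simplified: quoted parameter values are not handled.
function parseMime(input) {
  const [essence, ...rest] = input.split(';');
  const params = new Map();
  for (const part of rest) {
    const eq = part.indexOf('=');
    if (eq === -1) continue; // skip malformed parameters
    const name = part.slice(0, eq).trim().toLowerCase();
    const value = part.slice(eq + 1).trim();
    if (name && !params.has(name)) params.set(name, value);
  }
  return { essence: essence.trim().toLowerCase(), params };
}

// Only parameters the consumer understands are read; others are ignored.
const { essence, params } = parseMime('application/node;charset= Shift_JIS;foo=bar');
console.log(essence, params.get('charset')); // application/node Shift_JIS
```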

@Alhadis

Alhadis commented Dec 17, 2018

That's all well and good for ESM, but when authors start plugging their own loaders and formats into the module system, it's harder to guarantee authors will always be using UTF-8... 😞

@devsnek
Member

devsnek commented Dec 17, 2018

Most of the JS ecosystem assumes UTF-8. in node you should be able to use buffers and text decoders to handle the difference.

@bmeck
Member

bmeck commented Dec 17, 2018

@Alhadis as @devsnek mentions, any custom loaders can convert to supported formats such as JIS conversion to UTF-8 before handing it off to the runtime.
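A minimal sketch of the kind of conversion a custom loader might perform, using the global TextDecoder and Buffer. The helper name `toUtf8` is made up for this example. UTF-16LE is used because it should decode on any Node build; decoding encodings such as shift_jis is assumed to require a Node build with full ICU data.

```javascript
// Decodes source bytes in a declared charset and re-encodes them as UTF-8.
function toUtf8(bytes, charset) {
  const text = new TextDecoder(charset).decode(bytes);
  return Buffer.from(text, 'utf8');
}

// Source text stored as UTF-16LE, as a stand-in for any non-UTF-8 encoding.
const utf16 = Buffer.from('export default 1;', 'utf16le');
console.log(toUtf8(utf16, 'utf-16le').toString('utf8')); // export default 1;
```

The runtime then only ever sees UTF-8, regardless of what encoding the file used on disk.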

@Alhadis

Alhadis commented Dec 17, 2018

Ah that's different. Sorry, I thought the MIME mapping was attempting to address custom loaders as well. My bad.

@bmeck
Member

bmeck commented Dec 17, 2018

@Alhadis custom loaders should be able to query these, but for a variety of scenarios we cannot tie them together. For example, if a package supports coffeescript, and iced-coffeescript generates coffeescript, you get a good illustration:

  1. Package is-iced could list the MIME for iced-coffeescript.
  2. A custom loader could generate coffeescript from those iced-coffeescript files.
  3. A custom loader generates javascript from those coffeescript outputs.
  4. The runtime knows how to load javascript and does the appropriate steps.

We have nested compilation in that scenario, and it shows why you need loaders to generate formats that the runtime supports by default. If you hand Node a coffeescript file, it doesn't know what to do with it; it needs that loader to convert the file into something it understands. The same is true for character encodings.
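
The four steps above can be sketched as a loop that keeps applying loaders until the result is in a format the runtime handles natively. Everything here is hypothetical: the format names, the `loaders` map, and the `compile` function are illustrative, and the "compilers" only tag the source so the chain stays visible:

```javascript
// Formats the runtime is assumed to understand without help.
const runtimeFormats = new Set(["javascript"]);

// Hypothetical loaders keyed by the format they consume. Each returns the
// transformed source together with the format it produced.
const loaders = new Map([
  ["iced-coffeescript", (src) => ({ format: "coffeescript", source: `coffee(${src})` })],
  ["coffeescript", (src) => ({ format: "javascript", source: `js(${src})` })],
]);

// Keep applying loaders until the result is in a runtime-supported format.
function compile(format, source) {
  const seen = new Set();
  while (!runtimeFormats.has(format)) {
    if (seen.has(format)) {
      throw new Error(`Loader cycle detected at format "${format}"`);
    }
    const loader = loaders.get(format);
    if (!loader) {
      throw new Error(`No loader can handle format "${format}"`);
    }
    seen.add(format);
    ({ format, source } = loader(source));
  }
  return { format, source };
}

console.log(compile("iced-coffeescript", "await foo").source);
// js(coffee(await foo))
```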

@GeoffreyBooth
Member Author

Wow, an Iced CoffeeScript reference. @bmeck you warm my heart.

@Alhadis This proposal is based on the currently-shipping --experimental-modules implementation, and we’re building a new implementation with a different design over in the ecmascript-modules repo. A road map of its development is here. As currently designed the new implementation doesn’t have a need for user configuration of how Node handles file extensions, at least for treating .js files as ES modules, but this proposal is still here in case we decide to add such functionality in either implementation at some point.

@Alhadis

Alhadis commented Dec 18, 2018

building a new implementation with a different design over in the ecmascript-modules repo

Ah, crap, I'd thought it'd stalled because I've been monitoring the ESM API docs for changes, and never assumed there was progress because, well, the original EP link got archived but it was always there, so... 😅

I'm working on a library that's probably following this wayyyy closer than it should be, so... 😀 Consider that new repo

@Fishrock123
Contributor

My 2c: mode (with common named modes) is far easier for the average user to understand than mimes, and I would prefer mode like that any day. (Even if it solves all problems slightly less well.)

@GeoffreyBooth
Member Author

One other thought: we could even have mode and mimes, to provide the user with lots and lots of control, if we can work out how the two would interact. This almost starts to turn package.json into a loader via configuration, but that’s not necessarily a bad thing.

@MylesBorins
Contributor

MylesBorins commented Jan 16, 2019

soooo an idea. What if "loaders" and "mode" were the same thing.... we have a number of built-in loaders that are documented, e.g. esm, cjs, json, native. Then we could have a table such as

loaders: {
  ".js": "nodejs:cjs",
  ".esm": "esm",
  ".mjs": "nodejs:esm",
  ".wasm": "@loader-central/wasm",
  ".ts": "./loaders/typescript.js"
}

As you can see here we have a combination of built-in loaders (nodejs:loader), loaders from modules (esm), loaders from scopes (@loader-central/wasm), and loaders from the local file system (./loaders/typescript.js).
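
To make the four kinds of specifier concrete, here is a small sketch of how such a table might be read. `classifyLoader` and `loaderFor` are hypothetical names, and the prefix rules are an assumption rather than anything specified in this thread:

```javascript
// The table from the example above.
const loaderTable = {
  ".js": "nodejs:cjs",
  ".esm": "esm",
  ".mjs": "nodejs:esm",
  ".wasm": "@loader-central/wasm",
  ".ts": "./loaders/typescript.js",
};

// Hypothetical classification of a specifier string by its prefix.
function classifyLoader(specifier) {
  if (specifier.startsWith("nodejs:")) return "builtin";
  if (/^(\.\.?\/|\/)/.test(specifier)) return "file";
  if (specifier.startsWith("@")) return "scoped-package";
  return "package";
}

// Simplified lookup: assumes `filename` has no directory component.
function loaderFor(filename) {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot);
  if (!Object.prototype.hasOwnProperty.call(loaderTable, ext)) return undefined;
  const specifier = loaderTable[ext];
  return { ext, specifier, kind: classifyLoader(specifier) };
}

console.log(loaderFor("app.ts").kind);   // file
console.log(loaderFor("lib.wasm").kind); // scoped-package
```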

Thoughts?

edit:

adding example from bradmeck for using an array to compose multiple loaders into a pipeline

loaders: {
  ".jsx": ["@babel/...", "nodejs:cjs"]
}

@devsnek
Member

devsnek commented Jan 16, 2019

I like this direction, especially being able to specify loaders.

@ljharb
Member

ljharb commented Jan 16, 2019

Except for the builtin module part being a url instead of a scope (but assuming that we’d stick with whatever node core decided), that looks great to me. (I’d expect all builtin loaders to be under some kind of loader/ path, as well)

@guybedford
Contributor

I like the idea of progressive enhancement, but would want to keep the string value as the Phase 2 approach, where the string value effectively determines the full map without each extension having to be specified carefully, which I don't think most users should need to do.

Also, are we sure we want to make it easy to publish uncompiled source formats to npm? The module translator buffet still feels like a slippery slope to me, that we might want to leave to loaders in general.

@devsnek
Member

devsnek commented Jan 16, 2019

@guybedford you don't need to specify every combo, just the ones you want to override. Additionally, "mode": "esm" doesn't tell me anything about what it is doing. Is it making all files load as source text? Is it making .esm files work? Is it running my code with jdd's esm package? It just doesn't have good ergonomics.
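
A sketch of the "only override what you need" idea, assuming a hypothetical set of defaults (the thread never settled on the actual built-in loader names) and a hypothetical `effectiveLoaders` helper:

```javascript
// Hypothetical defaults; these names are assumptions for illustration.
const defaultLoaders = {
  ".js": "nodejs:cjs",
  ".mjs": "nodejs:esm",
  ".json": "nodejs:json",
};

// Entries in the package's "loaders" field replace matching defaults;
// everything else falls through unchanged.
function effectiveLoaders(pkg) {
  return { ...defaultLoaders, ...(pkg.loaders || {}) };
}

const pkg = { name: "example", loaders: { ".js": "nodejs:esm" } };
console.log(effectiveLoaders(pkg)[".js"]);  // nodejs:esm
console.log(effectiveLoaders(pkg)[".mjs"]); // nodejs:esm
```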

@bmeck
Member

bmeck commented Jan 16, 2019

@guybedford nothing is preventing people from requiring a global loader to compile their code properly to run. I've stated in the past that being able to statically know your loaders is preferable, and global loaders do not offer that since they just come from the CLI/ENV. I like that this approach lets people see what loaders are expected and allows tooling to use that knowledge statically.

I do have a list of features I would expect regarding multiple points of composition like the array example above.

  1. The ability to declare a specific format as "unhandled", i.e. mapped to an unknown format (browsers treat this as the "" type):

{
  "loaders": {".node": null}
}

     This disables the default handling of ".node" and returns "" (or some well-known signifier) as the expected format.

  2. The ability to intercept ALL files. Thinking of "loaders": as an Array would satisfy this, where each item can map to arbitrary file extensions or act as a "catch all":

{
  "loaders": [
    {".ts": "always-typescript", ".json": "always-json"},
    "always-cjs"
  ]
}

  3. The ability to use a short string to summarize some common usage. "loaders": "nodejs:default", for example, might apply to multiple file extensions, so it needs to be a catch all, which seems to conflict somewhat with "nodejs:cjs".

  4. It should be easy to configure things without writing a bunch of JS yourself. This leads to lots of JS events and piping data around through loaders... I think this needs to be thought out more to minimize spinning up loaders just to manage format disambiguation based upon file extension.

{
  // what goes here so I don't have to spin up JS to just say it has a mime of `application/vnd.coffeescript`
  "loaders": {".coffee": ...}
}
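
One possible reading of the first three items, as a sketch only. `resolveLoader` is hypothetical, and "the first matching entry wins" is an assumption; the thread does not settle whether array entries should instead compose as a pipeline:

```javascript
// Marker for an explicitly unhandled extension, mirroring browsers
// treating unknown types as "".
const UNHANDLED = "";

// Hypothetical resolver: a string is a catch-all, an object maps
// extensions, an array is tried in order, and an explicit null marks an
// extension as unhandled.
function resolveLoader(config, ext) {
  const layers = Array.isArray(config) ? config : [config];
  for (const layer of layers) {
    if (typeof layer === "string") return layer; // catch-all entry
    if (layer && Object.prototype.hasOwnProperty.call(layer, ext)) {
      return layer[ext] === null ? UNHANDLED : layer[ext];
    }
  }
  return undefined; // nothing matched; the caller would fall back to defaults
}

const layered = [
  { ".ts": "always-typescript", ".json": "always-json" },
  "always-cjs",
];
console.log(resolveLoader(layered, ".ts"));                 // always-typescript
console.log(resolveLoader(layered, ".js"));                 // always-cjs
console.log(resolveLoader("nodejs:default", ".mjs"));       // nodejs:default
console.log(resolveLoader({ ".node": null }, ".node") === UNHANDLED); // true
```

The fourth item, declaring a MIME for an extension without spinning up any JS, is left open here just as it is in the list.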

@devsnek
Member

devsnek commented Feb 2, 2019

@GeoffreyBooth btw there is more down here based on some additional constraints which isn't reflected in the "consensus" section at the top.

@GeoffreyBooth
Member Author

@devsnek Sure, I think at this point we would need to start a new thread or a new repo like one of the recent proposal repos. The consensus here was based on the currently-shipped --experimental-modules, and so a feature based on our new implementation might very well look different. Also per #160 (comment), we might want to design this after or in tandem with loaders.

But still, there was a lot of thought put into the discussion here, so this should be at least a starting point I think for whatever we continue working on.

@GeoffreyBooth
Member Author

Closing as this was based on the old --experimental-modules and there are new issues and PRs created since this discussion ended.
