npm Blog (Archive)

Dealing with problematic dependencies in a restricted network environment

When using npm Enterprise, we sometimes encounter public packages in our private registry that need to fetch resources from the public internet when being installed by a client via npm install.

Unfortunately, this poses a problem for developers who work in an environment with limited or no access to the public internet.

Let’s take a look at some of the more common types of problems in this area and talk about ways we can work around them.

Note that these problems are not specific to npm Enterprise; they arise whenever certain public packages are used in a limited-access environment. That being said, there are some things that npm (as an organization and software vendor) can do to better prevent or handle some of these problems. We’re still working to make these improvements.

Diagnosing the Problem

Typically, developers will discover the problem when installing packages from their private registry. When this happens, we need to determine the type of problem it is and where in the dependency graph the problematic dependency resides.

Problem Types

Here are some common problem types:

  1. Git repo dependency

    This is when a package dependency is listed in a package.json file with a reference to a Git repository instead of with a semver version range. Typically these point to a particular branch or revision in a public GitHub or Bitbucket repository. They are mainly used when the package contents have not been published to the public npm registry.

    When the npm client encounters these, it attempts to fetch the package from the Git repository directly, which is a problem for folks who do not have network access to the repository.

  2. Shrinkwrapped package

    This is when the internal contents of a package contain an npm-shrinkwrap.json file that lists a specific version and URL to use for each mentioned package from the dependency tree.

    During a normal npm install, the npm client attempts to fetch the dependencies listed in npm-shrinkwrap.json directly from the URLs contained in the file. This poses a problem when the client installing the shrinkwrapped package does not have access to the URLs that the shrinkwrap author has access to.

  3. Package with install script or node-gyp dependency

    This is when a package attempts to defer some setup process until the package is installed, using a script defined in package.json, which typically involves building platform-specific binaries or Node add-ons on the client’s machine.

    On a typical install, the npm client will find and run these scripts in order to automatically fetch and build the required resources, targeting the platform that the client is running on. But when limited internet access means the necessary resources cannot be fetched, the install will fail. The package will most likely remain unusable until the result of running the install script is reproduced on the client’s machine by some other means.
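
As a rough illustration of the three problem types, here is a sketch that scans an already-installed tree for each of them. The function names are hypothetical (they are not part of npm), the patterns are heuristics that assume pretty-printed package.json files with one key per line, and a match is only a hint for manual review.

```shell
# Hypothetical helpers, not part of npm. Heuristic: one JSON key per line.

# 1. Dependencies whose value looks like a Git URL or a "user/repo" shorthand.
list_git_deps() {
  find "${1:-node_modules}" -name package.json -exec awk '
    /"(dependencies|devDependencies|optionalDependencies)"[ \t]*:/ {
      in_deps = ($0 !~ /[}]/); next
    }
    in_deps && /[}]/ { in_deps = 0 }
    in_deps && /:[ \t]*"(git[+:@]|github:|bitbucket:|[A-Za-z0-9_.-]+\/[A-Za-z0-9_.-]+(#[^"]*)?")/ {
      print FILENAME ": " $0
    }
  ' {} \;
}

# 2. Packages that ship their own npm-shrinkwrap.json.
list_shrinkwraps() {
  find "${1:-node_modules}" -name npm-shrinkwrap.json
}

# 3. Packages that declare install-time scripts, plus packages containing a
#    binding.gyp file (npm runs node-gyp for those when no install script is set).
list_install_scripts() {
  root="${1:-node_modules}"
  find "$root" -name package.json -exec \
    grep -EHn '"(preinstall|install|postinstall)"[[:space:]]*:' {} +
  find "$root" -name binding.gyp
}
```

Each function takes the directory to scan as its only argument and defaults to node_modules in the current directory.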

Direct vs Transitive

To determine the location of the problematic dependency, we can boil it down to two categories:

  1. Direct dependency

    A direct dependency is one that is explicitly listed in your own package.json file — a dependency that your project/package uses directly in code or in an npm run script.

  2. Transitive dependency

    A transitive dependency is one that is not explicitly listed in your own package.json file — a dependency that comes from anywhere in the tree of your direct dependencies’ dependencies.
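
To tell the two categories apart for a given package name, one option is to check whether the name appears in your own package.json. The sketch below does that with a hypothetical dep_kind helper; it assumes a pretty-printed package.json with one dependency per line, and it reports "transitive" for anything not listed, so it only makes sense for a package you already know is in the tree.

```shell
# Hypothetical helper: print "direct" if the package is listed in the given
# package.json, otherwise "transitive". Assumes one dependency per line.
dep_kind() {
  name="$1"; manifest="${2:-package.json}"
  if awk -v key="\"$name\"" '
       /"(dependencies|devDependencies|optionalDependencies)"[ \t]*:/ {
         in_deps = ($0 !~ /[}]/); next
       }
       in_deps && /[}]/ { in_deps = 0 }
       in_deps {
         line = $0
         sub(/^[ \t]+/, "", line)          # drop indentation
         if (index(line, key) == 1) found = 1
       }
       END { exit !found }
     ' "$manifest"; then
    echo direct
  else
    echo transitive
  fi
}
```

For the location of a transitive dependency inside the installed tree, npm ls {pkg-name} prints the chain of packages that pulls it in.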

Potential Solutions

Just as publishing a package to the public registry requires access to the public internet, most of these solutions require internet access, at least temporarily. Once a solution is in place, access to public resources can be restricted again.

For starters, remember that it’s generally a good idea to use the latest version of the npm client. To install or upgrade to the latest version, regardless of what version of Node you have installed, run npm i -g npm@latest (and make sure npm -v prints the version that was installed).

Let’s go over the problem types in more detail.

Replacing a Git dependency

Unfortunately, a dependency that references a Git repository (instead of a semver range for a published package) must be replaced with a published package. To do this, you’ll need to first publish the Git repository as a package to your npm Enterprise registry and then fork the project with the Git dependency and replace the dependency with the package you published. Then, publish the forked project, and use that package as a dependency (instead of the original).

It’s usually a good idea to open an issue on the project with the Git dependency, politely asking the maintainers to replace the Git dependency, if possible. Generally, we discourage using Git dependencies in package.json; they are typically only used temporarily while a maintainer waits for an upstream fix to be applied and published.

Example: let’s replace the "grunt-mocha-istanbul": "christian-bromann/grunt-mocha-istanbul" Git dependency defined in version 4.0.4 of the webdriverio package, assuming that webdriverio is a direct dependency and grunt-mocha-istanbul is a transitive dependency.

We’ll tackle this in two main steps: forking and publishing the transitive dependency, and forking and publishing the direct dependency.

Step 1: Fork-publish the transitive dependency

  1. Clone the project that is referenced by the Git dependency

    Optionally, you can create a remote fork first (e.g., in GitHub or Bitbucket) and then clone your fork locally. Otherwise, you can just clone/download the project directly from the remote repository. It’s a good idea to use source control so you can keep a history of your changes, but you could also probably get away with downloading and extracting the project contents.

    Example:

    git clone https://github.com/christian-bromann/grunt-mocha-istanbul.git
    
  2. Create a new branch to hold your customizations

    Again, this is so you can keep a history of your changes. It’s probably a good idea to include the current version of the package in the branch name, in case you need to repeat these steps when a later version is available.

    Example:

    cd grunt-mocha-istanbul
    git checkout -b myco-custom-3.0.1
    
  3. Add your scope to the package name in package.json

    In our example, change "grunt-mocha-istanbul" to "@myco/grunt-mocha-istanbul".

  4. Commit your changes to your branch and publish the scoped package to your npm Enterprise registry

    Assuming you have already configured npm to associate your scope to your private registry, publishing should be as simple as npm publish.

    Example:

    git add package.json
    git commit -m 'add @myco scope to package name'
    npm publish
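
The manual edits in steps 2 through 4 can also be scripted. The sketch below is one way to do it, under a few assumptions: the helper names are hypothetical, package.json is pretty-printed with its top-level "name" and "version" keys appearing before any nested keys of the same name, and the registry URL is a placeholder for your own.

```shell
# Hypothetical helpers for the fork-publish steps; not part of npm.

# Print the "version" field, e.g. to build a branch name like myco-custom-3.0.1.
pkg_version() {
  sed -n 's/^[[:space:]]*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
    "${1:-package.json}" | head -n 1
}

# Prefix the package name with a scope: "foo" becomes "@myco/foo".
# A name that is already scoped is left alone.
add_scope() {
  scope="$1"; manifest="${2:-package.json}"
  awk -v scope="$scope" '
    !done && /^[ \t]*"name"[ \t]*:/ {
      if ($0 !~ /:[ \t]*"@/) sub(/:[ \t]*"/, ": \"" scope "/")
      done = 1
    }
    { print }
  ' "$manifest" > "$manifest.tmp" && mv "$manifest.tmp" "$manifest"
}

# Associate a scope with a registry by appending to an .npmrc file.
set_scope_registry() {
  printf '%s:registry=%s\n' "$1" "$2" >> "${3:-.npmrc}"
}

# Example usage inside the cloned project (placeholder registry URL):
#   git checkout -b "myco-custom-$(pkg_version)"
#   add_scope @myco
#   set_scope_registry @myco https://npm-registry.myco.com/
```

The .npmrc line written by set_scope_registry has the same effect as running npm config set @myco:registry {your-registry}, except that it applies to this project only.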
    

Step 2: Fork-publish the direct dependency

  1. Clone the project’s source code locally

    Either create a remote fork first (e.g., in GitHub or Bitbucket) and clone your fork locally, or just clone/download the project directly from the original remote repository. It’s a good idea to use source control so you can keep a history of your changes.

    Example:

    git clone https://github.com/webdriverio/webdriverio.git
    
  2. Create a new branch to hold your customizations

    This is so you can keep a history of your changes. It’s probably a good idea to include the current version of the package in the branch name, in case you need to repeat these steps when a later version is available.

    Example:

    cd webdriverio
    git checkout -b myco-custom-4.0.4
    
  3. Add your scope to the package name in package.json

    In our example, change "webdriverio" to "@myco/webdriverio".

  4. Replace the Git dependency with the scoped package

    This means updating the reference in package.json, and it may mean updating require() or import statements too. You should basically do a find-and-replace, finding the unscoped package name and judiciously replacing it with the scoped package name.

    In our example, we only need to update the reference in package.json from "grunt-mocha-istanbul": "christian-bromann/grunt-mocha-istanbul" to "@myco/grunt-mocha-istanbul": "^3.0.1".

  5. Commit your changes to your branch and publish the scoped package to your npm Enterprise registry

    Assuming you’ve already configured npm to associate your scope to your private registry, publishing should be as simple as npm publish.

    In our example of webdriverio, we next need to deal with the shrinkwrap URLs before we can publish (handled below). In other scenarios, it may be possible to publish now.

    Example:

    git add .
    git commit -m 'replace git dep with scoped fork'
    npm publish
    
  6. Update your downstream project(s) to use the scoped package as a direct dependency (in package.json and in any require() statements)

    In our example, this basically means doing a find-and-replace to find references to webdriverio and judiciously replace them with @myco/webdriverio. However, webdriverio also contains an npm-shrinkwrap.json file. We’ll cover that in the next section.
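
Steps 4 and 6 are both find-and-replace jobs, so here is a sketch of how they might be scripted. The helper names are hypothetical, replace_dep assumes each dependency sits on its own line in package.json, and find_references only catches the most common require() and import spellings, so its output is a list to review by hand and not a guarantee of completeness.

```shell
# Hypothetical helpers for steps 4 and 6; not part of npm.

# Rewrite a dependency entry in package.json, for example
#   "grunt-mocha-istanbul": "christian-bromann/grunt-mocha-istanbul"
# becomes
#   "@myco/grunt-mocha-istanbul": "^3.0.1"
replace_dep() {
  old="$1"; new="$2"; range="$3"; manifest="${4:-package.json}"
  sed "s|\"$old\"[[:space:]]*:[[:space:]]*\"[^\"]*\"|\"$new\": \"$range\"|" \
    "$manifest" > "$manifest.tmp" && mv "$manifest.tmp" "$manifest"
}

# List require() calls and import statements that still name the unscoped
# package, skipping anything under node_modules.
find_references() {
  name="$1"; dir="${2:-.}"
  grep -rnE --exclude-dir=node_modules "(require\(|from )['\"]$name['\"/]" "$dir"
}

# Example usage:
#   replace_dep grunt-mocha-istanbul @myco/grunt-mocha-istanbul '^3.0.1'
#   find_references webdriverio src
```

Listing the references first, instead of rewriting them blindly, keeps the "judicious" part of the find-and-replace in your hands.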

Working with a shrinkwrapped package

It just so happens that our sample direct dependency above (webdriverio) also uses an npm-shrinkwrap.json file to pin certain dependencies to specific versions. Unfortunately the shrinkwrap file contains hardcoded URLs to the public registry. We need a way to either ignore or fix the URLs.

Shrinkwrap Option 1: Ignore the shrinkwrap

A quick workaround is to install packages using the --no-shrinkwrap flag. This will tell the npm client to ignore any shrinkwrap files it finds in the package dependency tree and, instead, install the dependencies from package.json in the normal fashion.

This is considered a workaround rather than a long-term solution: it’s possible that installing from package.json will install versions of dependencies that don’t exactly match the ones listed in npm-shrinkwrap.json, even though the versions of the package’s direct dependencies are guaranteed to be within the declared semver range.

Example:

npm install webdriverio --no-shrinkwrap

(As noted above, webdriverio@4.0.4 also has a Git dependency, so just ignoring the shrinkwrap isn’t quite enough for this package.)

Shrinkwrap Option 2: Fix the URLs

If you want to use the exact versions from the shrinkwrap file without using the URLs in it, you’ll have to use your own custom fork of the project that contains a modified shrinkwrap file.

Here’s the general idea:

(Note that steps 1-3 are identical to the fork-publish instructions for a direct dependency above. If you’ve already completed them, skip to step 4.)

  1. Clone the project’s source code locally

    Either create a remote fork first (e.g., in GitHub or Bitbucket) and clone your fork locally, or just clone/download the project directly from the original remote repository. It’s a good idea to use source control so you can keep a history of your changes.

    Example:

    git clone https://github.com/webdriverio/webdriverio.git
    
  2. Create a new branch to hold your customizations

    This is so you can keep a history of your changes. It’s probably a good idea to include the current version of the package in the branch name, in case you need to repeat these steps when a later version is available.

    Example:

    cd webdriverio
    git checkout -b myco-custom-4.0.4
    
  3. Add your scope to the package name in package.json

    In our example, change "webdriverio" to "@myco/webdriverio".

  4. Use rewrite-shrinkwrap-urls to modify npm-shrinkwrap.json, pointing the URLs to your npm Enterprise registry

    Unfortunately, this is slightly more complicated than a find-and-replace, since the tarball URL structure of the public registry is different from the one used for an npm Enterprise private registry.

    In the example below, replace {your-registry} with the base URL of your private registry, e.g., https://npm-registry.myco.com or http://localhost:8080. The value you use should come from the Full URL of npm Enterprise registry setting in your Enterprise admin UI Settings page.

    Example:

    npm install -g rewrite-shrinkwrap-urls
    rewrite-shrinkwrap-urls -r {your-registry}
    git diff npm-shrinkwrap.json
    
  5. Commit your changes to your branch and publish the scoped package to your npm Enterprise registry

    Assuming you’ve already configured npm to associate your scope to your private registry, publishing should be as simple as npm publish.

    Be mindful of any prepublish or publish scripts that may be defined in package.json. You can try skipping those scripts when publishing via npm publish --ignore-scripts, but running the scripts may be necessary to put the package into a usable state, e.g., if source transpilation is required.

    Example:

    git add npm-shrinkwrap.json package.json
    git commit -m 'add @myco scope to package name' package.json
    git commit -m 'rewrite shrinkwrap urls' npm-shrinkwrap.json
    npm publish
    

    Note that a prepublish script will probably need to install the package’s dependencies in order to run. In this case, npm install will be executed first. If this happens, it should pull all dependencies in the shrinkwrap file from your registry. If any of those packages don’t yet exist in your registry, you’ll need either to enable the Read Through Cache setting in your Enterprise instance or to manually add the packages to the white-list by running npme add-package webdriverio from your server’s shell and answering Y at the prompt to add dependencies.

  6. Update your downstream project(s) to use the scoped package as a direct dependency (in package.json and in any require() statements)

    In our example, this basically means doing a find-and-replace to find references to webdriverio and judiciously replace them with @myco/webdriverio.
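
Before publishing, it can be worth confirming that the rewrite in step 4 left no URLs pointing outside your registry. The sketch below is a hypothetical check (it is not part of rewrite-shrinkwrap-urls) that counts the "resolved" entries in the shrinkwrap file that do not start with your registry's base URL. It assumes one "resolved" entry per line, which is how npm writes the file.

```shell
# Hypothetical check: count "resolved" URLs that do NOT start with the given
# registry base URL. A result of 0 suggests every URL was rewritten.
count_foreign_urls() {
  registry="$1"; file="${2:-npm-shrinkwrap.json}"
  grep '"resolved"[[:space:]]*:' "$file" | grep -vcF "\"$registry" || true
}

# Example usage, with a placeholder registry URL:
#   count_foreign_urls https://npm-registry.myco.com
```

Any entry that still resolves to a Git URL is counted as well, which is a useful reminder that the Git dependency from the earlier section has to be replaced too.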

This is less than ideal, obviously. We’re currently considering ways to improve handling of shrinkwrapped packages on the server side, but a better solution is not yet available.

Packages with install scripts or node-gyp dependencies

Some packages want or need to run some script(s) on installation in order to build platform-specific dependencies or otherwise put the package into a usable state. This approach means that a package can be distributed as platform-independent source without having to prebundle binaries or provide multiple installation options.

Unfortunately, this also means that these packages typically need access to the public internet in order to fetch required resources. We can’t do much to work around this, other than attempt to separate the step of fetching the package from the registry from the step of setting up the platform-specific resources it needs.

Ignore install scripts

As a quick first attempt, you can ignore lifecycle scripts when installing packages via npm install {pkg-name} --ignore-scripts.

Unfortunately, install scripts typically do some sort of platform-specific setup to make the package usable. Thus, you should review the install or postinstall scripts from the package’s package.json file and determine if you need to attempt to run them separately or somehow achieve the same result manually.
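
To review what a skipped script would have done, you can pull its command out of the installed package's package.json. The script_cmd helper below is a hypothetical sketch; it assumes each script is declared on its own line and prints the command as written in the JSON (any escaped quotes stay escaped).

```shell
# Hypothetical helper: print the command declared for a lifecycle script
# (e.g. preinstall, install, postinstall) in a package.json, or nothing if
# the script is not declared.
script_cmd() {
  sed -n "s/^[[:space:]]*\"$1\"[[:space:]]*:[[:space:]]*\"\(.*\)\",\{0,1\}[[:space:]]*\$/\1/p" \
    "${2:-package.json}" | head -n 1
}

# Example usage after installing with --ignore-scripts:
#   script_cmd install node_modules/{pkg-name}/package.json
#   script_cmd postinstall node_modules/{pkg-name}/package.json
```

If the printed command only builds local sources, running it by hand from inside the package directory may be enough; if it downloads something, that download is the part you will need to reproduce some other way.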

Set up node-gyp build toolchain manually

When node-gyp is involved in the setup process, the package requires platform-specific binaries to be built and plugged into the Node runtime on the client’s system. In order to build the binaries, the package will typically need to fetch source header files for the Node API.

The best we can do is attempt to set up the node-gyp build toolchain manually. This requires Python and a C/C++ compiler. You can read more about this at the following locations:

General installation: https://github.com/nodejs/node-gyp#installation

Windows issues: https://github.com/nodejs/node-gyp/issues/629

A good example of a package with a node-gyp dependency is node-sass.

Once the build toolchain is in place, the package’s install script may not need to fetch any external resources.
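
A quick pre-flight check can tell you whether the prerequisites named above are present before you attempt a build. This is a hypothetical sketch: the command names it looks for are assumptions about a typical Unix-like system, and the Python version that node-gyp needs depends on the node-gyp release, so treat a passing check as a starting point only.

```shell
# Hypothetical pre-flight check for the node-gyp prerequisites.

# Succeeds if at least one of the given commands is on PATH.
have_any() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 && return 0
  done
  return 1
}

# Prints what appears to be missing; returns non-zero if anything is.
check_gyp_toolchain() {
  status=0
  have_any python3 python python2 || { echo "missing: Python"; status=1; }
  have_any cc gcc clang c++ g++   || { echo "missing: C/C++ compiler"; status=1; }
  return "$status"
}

# Example usage:
#   check_gyp_toolchain && echo "toolchain looks usable"
#
# node-gyp also documents a --nodedir option for pointing a build at a local
# copy of the Node source instead of downloading headers, e.g.
#   node-gyp rebuild --nodedir=/path/to/node/source
```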

What’s next?

If you’ve made it all the way to the end, surely you’ll agree that npm could be handling things better to minimize challenges faced by folks with restricted internet access. We feel it’s in the community’s best interest to at least raise awareness of these problems and their potential workarounds until we can get a more robust solution in place.

If you have feedback or questions, as always, please don’t hesitate to let us know.