If you’ve ever been told you don’t exist by a software package, you…
- Might be a DevOps Engineer
- Might also start questioning your life decisions
Over the holidays we have been in the process of adding a new git repository as an NPM package to our build step for a number of projects. This can be problematic in a pipeline, since for git you also have to manage things like SSH keys, and in a pipeline most often the environment may be obscured to where you can’t really add them.
This situation caused a number of errors in our build step that we (incorrectly) assumed were caused by a bad underlying server. Turns out… we just played ourselves.
The authenticity of host ‘(your git host)’ can’t be established.
The first of this comedy of errors came from this wonderful prompt. You’ve probably seen it before, the first time you connect to a new git host. Usually you can just say “Yes, git overlords, I accept your almighty fingerprint” and we all move on with our lives. But in a container in our build step, it’s not an interactive shell. So instead, it just hangs there forever until someone wonders why that deploy never happened, and checks on it.
After only 94 deploy attempts in an effort to figure this out, we finally realized two things:
npm installwas taking place in a cached build step (that our deploy system conveniently placed at the very bottom of the configuration page instead of, you know, before the build steps).
- All our attempts to fix the issue were being placed in the actual build step which takes place after the
npm installand were therefore fruitless.
Anyways, once we figured that simple piece of wisdom out, we were able to resolve it by adding this line before the npm install:
mkdir -p -m 0600 ~/.ssh && ssh-keyscan <your git host> > ~/.ssh/known_hosts
Could not create leading directories
The next error we encountered was something that was probably changed on the underlying server but we can’t be certain. All of a sudden, public git packages started giving an error because git couldn’t write to a cache/tmp directory — the intermediary directories didn’t exist first.
[?25hnpm ERR! code 128 npm ERR! Command failed: git clone --depth=1 -q -b v1.4.1 git://github.com/hubspot/messenger.git /root/.npm/_cacache/tmp/git-clone-2e2bbd46 npm ERR! fatal: could not create leading directories of '/root/.npm/_cacache/tmp/git-clone-2e2bbd46': Permission denied
The issue in this case was that the user wasn’t able to create this new directory for the git clone, because the parent directory(ies) didn’t exist, or the user didn’t have permissions to write to them. As this wasn’t an issue before, we believe the issue was that the directory permissions for executables on the underlying server. Ultimately what fixed it was changing the prefix for npm to somewhere that both exists and is writeable:
npm config set prefix /usr/local
You don’t exist, go away!
And finally, this supremely unhelpful error. In doing some research, this is actually an SSH error. It occurs when you’re trying to SSH as a user ID that doesn’t exist. So like, I guess it makes sense in that exact situation. But, our user is “root” and it definitely exists. If it didn’t, this whole environment would probably collapse in on itself.
[?25hnpm ERR! Error while executing:output npm ERR! /usr/bin/git ls-remote -h -t <your git host> npm ERR! npm ERR! You don't exist, go away! npm ERR! fatal: The remote end hung up unexpectedly npm ERR! npm ERR! exited with error code: 128
This error presented itself when trying to install a private git repository as a NPM package for the first time (for this particular app and container).
After about 59 tries to figure out what exactly was wrong with the user, container, and anything else in the environment, we finally noticed something different in this project’s package.json file — it was doing the
npm install with the “global”
-g flag. Thinking back to the last issue, I decided to try to change the prefix (which I had already tried, and it didn’t help), but this time with the
-g flag as well.
npm config set -g prefix /usr/local
Like magic, it worked.
Build steps can be a frustrating troubleshooting environment. When you don’t have access to the server itself, it can be cumbersome and loud to try to find the cause of errors. And, those errors don’t always present themselves in the same way. Most of these errors did not occur when testing from the same container on local. And, many of these errors produced little to no results in doing a google search. I hope this article helps some weary DevOps souls out there! Feel free to comment with other weird build step issues you’ve encountered as well, or contact me.