Linstor creates invalid configuration files when required ext tools are missing on a node

sewi · September 3, 2025, 6:06am

Not very critical, since it requires you to be in a situation where a node is in the state “required ext tools missing”, but I feel this could be handled better.

Three nodes:

node1: OK
node2: OK
node3: required_ext_tools_missing

node3 functions as tiebreaker only. All that on Proxmox.

In this situation, the *.res files generated on node1 and node2 contain an empty hostname where it should say “node3”. As a result, no disk operations were possible in Proxmox, and any node that got rebooted could no longer read its configuration files.

This was quickly fixed with

sed -i 's/on ""/on "node3"/g' /var/lib/linstor.d/*.res
sed -i 's/host ""/host "node3"/g' /var/lib/linstor.d/*.res

but it would be neater if it didn’t do that

I fixed the required_ext_tools_missing state and rebooted the node. I had to re-run the sed commands, because the configuration files were generated again without the node3 host, but after that it ran stable once again.
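
For anyone repeating the sed workaround, here is a slightly more defensive variant as a rough sketch. It only touches files that actually contain an empty `on ""` or `host ""` entry and copies each one to a backup directory first. The function name `fix_empty_hosts`, its arguments, and the backup directory are my own assumptions, not anything LINSTOR provides, and it relies on GNU sed (`-i`, `-E`). Since the files were regenerated after a reboot in my case, treat it as a temporary patch only.

```shell
#!/bin/sh
# Sketch only: fix_empty_hosts is a hypothetical helper, not part of LINSTOR.
# Arguments: <res_dir> <peer_name> <backup_dir>
fix_empty_hosts() {
    res_dir=$1
    peer=$2          # assumed to be a plain hostname without sed metacharacters
    backup_dir=$3
    mkdir -p "$backup_dir"
    for f in "$res_dir"/*.res; do
        [ -e "$f" ] || continue
        # Only patch files that contain an empty on/host name.
        if grep -Eq '^[[:space:]]*(on|host) ""' "$f"; then
            cp -p "$f" "$backup_dir/"
            # Replace the empty name, keeping the original indentation.
            sed -i -E "s/^([[:space:]]*(on|host)) \"\"/\1 \"$peer\"/" "$f"
            echo "patched: $f"
        fi
    done
}
```

On my setup the call would have been `fix_empty_hosts /var/lib/linstor.d node3 /root/res-backup` (the backup path is just an example).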

PS: The documentation could also be updated to list the different node statuses in the LINSTOR controller. I don’t remember the exact wording, but it was something like REQ_EXT_TOOL_MISSING or similar; it looked like a constant. Using Google, I found no documentation for that status, except for the one commit where it was introduced. I think it would help to have a list of all possible statuses and what they mean.

rp9 · September 4, 2025, 11:20am

Hi!

Could you be a bit more specific?

For example, which external tool was missing? Are there any error reports or satellite logs?

sewi · September 4, 2025, 6:36pm

I would like to be more specific, but the node status was just that - required external tool missing. It read that instead of “OK” on that satellite, at the /ui/#/inventory/nodes URI of the LINSTOR controller. I can’t find the exact term, but it was something like REQ_EXT_TOOL_MISSING or such.

In the error logs I do find an error reading

Received a resource that requires DRBD9_KERNEL but that external tool is not supported on this satellite (MissingRequiredExtToolsStorageException)

so it could be that. There’s also a

Cannot run program "cat": error=0, Failed to exec spawn helper: pid: 3506215, exit value: 1 (IOException)

error. The system required a reboot, which fixed the error. Which is fine.

The problem is that the resource files on the other two nodes had an empty host name. The .res files all looked like this:

# This file was generated by LINSTOR (1.31.3), do not edit manually.
# Name
#   LINSTOR nodename: node1
#   Local hostname  : node1
# File generated at:
#   Local time      : 2025-09-02 21:14:06
#   UTC             : 2025-09-02 19:14:06

resource "pm-12345678"
{

[...]

    on ""
    {
        volume 0
        {
            disk        none;
            disk
            {
                discard-zeroes-if-aligned yes;
                rs-discard-granularity 1048576;
            }
            meta-disk   internal;
            device      minor 1051;
        }
        node-id    2;
    }

[...]
    connection
    {

        net
        {
            allow-two-primaries yes;
            protocol C;
        }

        disk
        {
            c-max-rate 0;
            c-min-rate 0;
        }
        host "node1" address ipv4 10.1.1.1:7051;
        host "" address ipv4 10.1.1.3:7051;
    }
}

Which, again, is something quick to fix, but it would be neater if it didn’t do that, because if you reboot any of the other nodes, they won’t be able to parse the .res file and wouldn’t be able to bring their resources up. I guess instead of an empty string you could just put anything in the hostname, so at least the configuration files aren’t invalid on the two functioning nodes
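
A quick way to check whether a node is affected before rebooting it, as a sketch: list every .res file that contains an empty `on ""` or `host ""` entry. The function name `find_empty_host_res` is made up for illustration; the directory is the one from my setup.

```shell
#!/bin/sh
# Sketch only: find_empty_host_res is a hypothetical helper name.
# Prints the .res files in the given directory that contain an empty
# on/host name; returns non-zero if none are found.
find_empty_host_res() {
    grep -lE '^[[:space:]]*(on|host) ""' "$1"/*.res 2>/dev/null
}
```

For example, `find_empty_host_res /var/lib/linstor.d` prints the affected files, and printing nothing means no file matched the pattern.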

rp9 · September 5, 2025, 5:53am

We have already seen this error. It is triggered if the Satellite was started and package upgrades are afterwards done to the JRE. After that, spawning any external command no longer works, which makes the Satellite completely useless, but it should be easily fixed by a satellite restart.

My guess is that the satellite then couldn’t report back its real hostname and for some reason LINSTOR wrote "" into the .res file.

sewi · September 5, 2025, 7:25am

Absolutely. I’m just saying, LINSTOR shouldn’t write "" into the configuration file, because it makes it invalid and - at least as long as you don’t manually edit them - useless to the other working nodes.

I suppose you could write anything other than an empty string so that at least it parses? 'invalid-hostname-10.1.1.3'? Seems like an easy fix, and it would increase the robustness of your solution.
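
To illustrate the idea, here is a minimal sketch of the fallback logic in shell. This is not LINSTOR code (I haven't looked at how the generator picks the name), and `peer_name_or_placeholder` is a hypothetical helper: it returns the real name when there is one and otherwise builds a placeholder from the peer's address. Whether DRBD would then behave sensibly with such a placeholder is an open question; the point is only that the file would not contain an empty string.

```shell
#!/bin/sh
# Sketch only: hypothetical helper, not taken from LINSTOR.
# Arguments: <name> <address>
# Prints <name> if it is non-empty, otherwise a placeholder built from <address>.
peer_name_or_placeholder() {
    name=$1
    addr=$2
    if [ -n "$name" ]; then
        printf '%s\n' "$name"
    else
        printf 'invalid-hostname-%s\n' "$addr"
    fi
}
```

With the values from my .res file, `peer_name_or_placeholder "" 10.1.1.3` would give invalid-hostname-10.1.1.3 instead of an empty name.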
