How can I recover from failed storage?

I’m currently testing some storage configurations and failure scenarios with LINSTOR on Proxmox.

During this testing, one of my ZFS pools failed entirely, and I seem to have hit a catch-22 that I can’t get myself out of.

The failed ZFS pool is called “hdd”, and it held only one resource, pm-7688d510.

The hdd pool backs a LINSTOR storage pool called “BulkPool”, which in turn has a resource group called “RG-BulkPool” on it. My resource pm-7688d510 is part of that group; a rough sketch of how the stack was set up is below.
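For context, this is roughly how the stack was created (from memory, so the device path and exact flags are approximate; the resource itself was spawned by the linstor-proxmox plugin rather than by hand):

# ZFS pool that backs everything (device path is just an example)
zpool create hdd /dev/sdX

# LINSTOR storage pool on top of the ZFS pool, using the thin-provisioned ZFS backend
linstor storage-pool create zfsthin Ennead BulkPool hdd

# Resource group (and its volume group) that places resources into BulkPool
linstor resource-group create RG-BulkPool --storage-pool BulkPool --place-count 1 --diskless-on-remaining
linstor volume-group create RG-BulkPool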

However, because the ZFS pool is dead and gone, it seems completely impossible to delete my resource, and I cannot for the life of me find a way to declare the storage pool and resource lost so that they can be force-deleted.

Is anyone able to advise how I can delete this resource so I can continue the recovery/rebuild process?

Any help would be very much appreciated, as I haven’t been able to find the answer on my own.

See below for the output of the various commands I ran and the errors they produced:

linstor resource-definition delete pm-7688d510

SUCCESS:
Description:
    Resource definition 'pm-7688d510' marked for deletion.
Details:
    Resource definition 'pm-7688d510' UUID is: bac860f6-b188-401f-b6a3-cc09c6194e07
ERROR:
Description:
    (Node: 'Ennead') No response generated by handler.
Details:
    In API call 'ChangedRsc'.

linstor resource delete Ennead pm-7688d510

SUCCESS:
Description:
    Node: Ennead, Resource: pm-7688d510 preparing for deletion.
Details:
    Node: Ennead, Resource: pm-7688d510 UUID is: 67fdb7ad-c169-41c5-b7f3-fe3fbb1b8d33
ERROR:
Description:
    (Node: 'Ennead') No response generated by handler.
Details:
    In API call 'ChangedRsc'.
ERROR:
Description:
    Deletion of resource 'pm-7688d510' on node 'Ennead' failed due to an unknown exception.
Details:
    Node: Ennead, Resource: pm-7688d510
Show reports:
    linstor error-reports show 67CDE870-00000-000006

linstor error-reports show 67CDE870-00000-000006

ERROR REPORT 67CDE870-00000-000006

============================================================

Application:                        LINBIT® LINSTOR
Module:                             Controller
Version:                            1.30.4
Build ID:                           bef74a44609cb592c5efad2e707b50e696623c61
Build time:                         2025-02-03T15:48:28+00:00
Error time:                         2025-03-09 16:15:43
Node:                               Ennead
Thread:                             MainWorkerPool-19
Access context information

Identity:                           PUBLIC
Role:                               PUBLIC
Domain:                             PUBLIC

Peer:                               RestClient(192.168.0.8; 'PythonLinstor/1.24.0 (API1.0.4): Client 1.24.0')

============================================================

Reported error:
===============

Category:                           RuntimeException
Class name:                         DelayedApiRcException
Class canonical name:               com.linbit.linstor.core.apicallhandler.response.CtrlResponseUtils.DelayedApiRcException
Generated at:                       Method 'lambda$mergeExtractingApiRcExceptions$6', Source file 'CtrlResponseUtils.java', Line #188

Error message:                      Exceptions have been converted to responses

Error context:
        Deletion of resource 'pm-7688d510' on node 'Ennead' failed due to an unknown exception.
Asynchronous stage backtrace:
        (Node: 'Ennead') No response generated by handler.

    Error has been observed at the following site(s):
        *__checkpoint ⇢ Prepare resource delete
        *__checkpoint ⇢ Activating resource if necessary before deletion
    Original Stack Trace:

Call backtrace:

    Method                                   Native Class:Line number
    lambda$mergeExtractingApiRcExceptions$6  N      com.linbit.linstor.core.apicallhandler.response.CtrlResponseUtils:188

Suppressed exception 1 of 2:
===============
Category:                           RuntimeException
Class name:                         ApiRcException
Class canonical name:               com.linbit.linstor.core.apicallhandler.response.ApiRcException
Generated at:                       Method 'handleAnswer', Source file 'CommonMessageProcessor.java', Line #344

Error message:                      (Node: 'Ennead') No response generated by handler.

Error context:
        Deletion of resource 'pm-7688d510' on node 'Ennead' failed due to an unknown exception.
Call backtrace:

    Method                                   Native Class:Line number
    handleAnswer                             N      com.linbit.linstor.proto.CommonMessageProcessor:344
    handleDataMessage                        N      com.linbit.linstor.proto.CommonMessageProcessor:297
    doProcessInOrderMessage                  N      com.linbit.linstor.proto.CommonMessageProcessor:245
    lambda$doProcessMessage$4                N      com.linbit.linstor.proto.CommonMessageProcessor:230
    subscribe                                N      reactor.core.publisher.FluxDefer:46
    subscribe                                N      reactor.core.publisher.Flux:8773
    onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:427
    drainAsync                               N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:453
    drain                                    N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:724
    onNext                                   N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:256
    drainFused                               N      reactor.core.publisher.SinkManyUnicast:319
    drain                                    N      reactor.core.publisher.SinkManyUnicast:362
    tryEmitNext                              N      reactor.core.publisher.SinkManyUnicast:237
    tryEmitNext                              N      reactor.core.publisher.SinkManySerialized:100
    processInOrder                           N      com.linbit.linstor.netcom.TcpConnectorPeer:422
    doProcessMessage                         N      com.linbit.linstor.proto.CommonMessageProcessor:228
    lambda$processMessage$2                  N      com.linbit.linstor.proto.CommonMessageProcessor:165
    onNext                                   N      reactor.core.publisher.FluxPeek$PeekSubscriber:185
    runAsync                                 N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:440
    run                                      N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:527
    call                                     N      reactor.core.scheduler.WorkerTask:84
    call                                     N      reactor.core.scheduler.WorkerTask:37
    run                                      N      java.util.concurrent.FutureTask:264
    run                                      N      java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask:304
    runWorker                                N      java.util.concurrent.ThreadPoolExecutor:1136
    run                                      N      java.util.concurrent.ThreadPoolExecutor$Worker:635
    run                                      N      java.lang.Thread:840

Suppressed exception 2 of 2:
===============
Category:                           RuntimeException
Class name:                         OnAssemblyException
Class canonical name:               reactor.core.publisher.FluxOnAssembly.OnAssemblyException
Generated at:                       Method 'lambda$mergeExtractingApiRcExceptions$6', Source file 'CtrlResponseUtils.java', Line #188

Error message:
Error has been observed at the following site(s):
        *__checkpoint ⇢ Prepare resource delete
        *__checkpoint ⇢ Activating resource if necessary before deletion
Original Stack Trace:

Error context:
        Deletion of resource 'pm-7688d510' on node 'Ennead' failed due to an unknown exception.
Call backtrace:

    Method                                   Native Class:Line number
    lambda$mergeExtractingApiRcExceptions$6  N      com.linbit.linstor.core.apicallhandler.response.CtrlResponseUtils:188
    subscribe                                N      reactor.core.publisher.FluxDefer:46
    subscribe                                N      reactor.core.publisher.Flux:8773
    onComplete                               N      reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber:258
    onComplete                               N      reactor.core.publisher.FluxMap$MapSubscriber:144
    checkTerminated                          N      reactor.core.publisher.FluxFlatMap$FlatMapMain:847
    drainLoop                                N      reactor.core.publisher.FluxFlatMap$FlatMapMain:609
    innerComplete                            N      reactor.core.publisher.FluxFlatMap$FlatMapMain:895
    onComplete                               N      reactor.core.publisher.FluxFlatMap$FlatMapInner:998
    onComplete                               N      reactor.core.publisher.Operators$MultiSubscriptionSubscriber:2205
    request                                  N      reactor.core.publisher.Operators$ScalarSubscription:2547
    set                                      N      reactor.core.publisher.Operators$MultiSubscriptionSubscriber:2341
    onSubscribe                              N      reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber:74
    subscribe                                N      reactor.core.publisher.FluxJust:68
    subscribe                                N      reactor.core.publisher.Flux:8773
    onError                                  N      reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber:103
    onError                                  N      reactor.core.publisher.FluxMap$MapSubscriber:134
    onError                                  N      reactor.core.publisher.FluxConcatArray$ConcatArraySubscriber:207
    onError                                  N      reactor.core.publisher.FluxPeek$PeekSubscriber:222
    onError                                  N      reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber:106
    error                                    N      reactor.core.publisher.Operators:198
    subscribe                                N      reactor.core.publisher.FluxError:43
    subscribe                                N      reactor.core.publisher.Flux:8773
    onError                                  N      reactor.core.publisher.FluxOnErrorResume$ResumeSubscriber:103
    onError                                  N      reactor.core.publisher.FluxMapFuseable$MapFuseableSubscriber:142
    onError                                  N      reactor.core.publisher.FluxUsing$UsingSubscriber:217
    onError                                  N      reactor.core.publisher.Operators$MultiSubscriptionSubscriber:2210
    error                                    N      reactor.core.publisher.FluxCreate$BaseSink:474
    drain                                    N      reactor.core.publisher.FluxCreate$BufferAsyncSink:802
    error                                    N      reactor.core.publisher.FluxCreate$BufferAsyncSink:747
    drainLoop                                N      reactor.core.publisher.FluxCreate$SerializedFluxSink:237
    drain                                    N      reactor.core.publisher.FluxCreate$SerializedFluxSink:213
    error                                    N      reactor.core.publisher.FluxCreate$SerializedFluxSink:189
    apiCallError                             N      com.linbit.linstor.netcom.TcpConnectorPeer:502
    handleAnswer                             N      com.linbit.linstor.proto.CommonMessageProcessor:356
    handleDataMessage                        N      com.linbit.linstor.proto.CommonMessageProcessor:297
    doProcessInOrderMessage                  N      com.linbit.linstor.proto.CommonMessageProcessor:245
    lambda$doProcessMessage$4                N      com.linbit.linstor.proto.CommonMessageProcessor:230
    subscribe                                N      reactor.core.publisher.FluxDefer:46
    subscribe                                N      reactor.core.publisher.Flux:8773
    onNext                                   N      reactor.core.publisher.FluxFlatMap$FlatMapMain:427
    drainAsync                               N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:453
    drain                                    N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:724
    onNext                                   N      reactor.core.publisher.FluxFlattenIterable$FlattenIterableSubscriber:256
    drainFused                               N      reactor.core.publisher.SinkManyUnicast:319
    drain                                    N      reactor.core.publisher.SinkManyUnicast:362
    tryEmitNext                              N      reactor.core.publisher.SinkManyUnicast:237
    tryEmitNext                              N      reactor.core.publisher.SinkManySerialized:100
    processInOrder                           N      com.linbit.linstor.netcom.TcpConnectorPeer:422
    doProcessMessage                         N      com.linbit.linstor.proto.CommonMessageProcessor:228
    lambda$processMessage$2                  N      com.linbit.linstor.proto.CommonMessageProcessor:165
    onNext                                   N      reactor.core.publisher.FluxPeek$PeekSubscriber:185
    runAsync                                 N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:440
    run                                      N      reactor.core.publisher.FluxPublishOn$PublishOnSubscriber:527
    call                                     N      reactor.core.scheduler.WorkerTask:84
    call                                     N      reactor.core.scheduler.WorkerTask:37
    run                                      N      java.util.concurrent.FutureTask:264
    run                                      N      java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask:304
    runWorker                                N      java.util.concurrent.ThreadPoolExecutor:1136
    run                                      N      java.util.concurrent.ThreadPoolExecutor$Worker:635
    run                                      N      java.lang.Thread:840


END OF ERROR REPORT.

linstor storage-pool list

StoragePool  Node    Driver    PoolName  FreeCapacity  TotalCapacity  CanSnapshots  State  SharedName
BulkPool     Ennead  ZFS_THIN  hdd       0 KiB         0 KiB          True          Error  Ennead;BulkPool

linstor resource-group list

ResourceGroup  SelectFilter                 VlmNrs  Description
RG-BulkPool    PlaceCount: 1                0
               StoragePool(s): BulkPool
               DisklessOnRemaining: True

linstor resource list

ResourceName  Node    Layers        Usage  Conns  State     CreatedOn
pm-7688d510   Ennead  DRBD,STORAGE         Ok     DELETING  2025-03-06 22:49:56

linstor resource-definition list

ResourceName  Port  ResourceGroup  Layers        State
pm-7688d510   7006  RG-BulkPool    DRBD,STORAGE  DELETING

Update: I was actually able to find a workaround by recreating my ZFS pool hdd so that it was accessible again and then rerunning the linstor rd delete pm-7688d510 command. A rough sketch of what that looked like is below.
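In case it helps anyone else who hits the same catch-22, the workaround looked roughly like this (the vdev path is just an example; the point is only that a pool named hdd exists again so the satellite can reach the backend):

# Recreate a ZFS pool with the same name the LINSTOR storage pool expects
zpool create hdd /dev/sdX

# With the backing pool reachable again, the stuck deletion goes through
linstor resource-definition delete pm-7688d510

# Confirm the resource and its definition are gone
linstor resource list
linstor resource-definition list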

However, if there isn’t one already, there should be a way to clean up lost storage. If such a method already exists, I’d really appreciate hearing about it.
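For comparison, LINSTOR already covers the whole-node case with linstor node lost, which force-removes a node that is permanently gone; what I was missing is something similar scoped to a single storage pool or resource. Purely to illustrate what I mean (the second command is hypothetical and does not exist as far as I can tell):

# Existing command: declare an entire node permanently lost.
# Too heavy-handed here, since it would drop everything on Ennead, not just BulkPool.
linstor node lost Ennead

# Hypothetical equivalent for a single storage pool (not a real command as far as I know)
linstor storage-pool lost Ennead BulkPool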