Triton server crash during NLP intent inference

August 29, 2024, 5:44pm 1

Hey team, I’m trying to run a text classification NLP task, but when I run inference with a sample model downloaded from NGC, the Triton server crashes.

This is how to reproduce the issue:

  1. I’m able to run riva_init.sh and riva_start.sh with the default config.sh successfully.
  2. Download the RIVA Intent Slot model from NGC. Apparently, the quick start for embedded does not include NLP models, so I have to download this one separately. (If there is an easier way to try text classification, please advise.)
  3. Then I follow these instructions to deploy the .riva model. First, I launch the ServiceMaker image:
docker run --gpus all -it --rm -v /ssd/code/riva_models:/servicemaker-dev \
    -v /ssd/code/riva_quickstart_arm64_v2.16.0/model_repository:/data \
    --entrypoint="/bin/bash" nvcr.io/nvidia/riva/riva-speech:2.16.0-servicemaker-l4t-aarch64
  4. Then I run riva-build:
riva-build intent_slot \
    /servicemaker-dev/domain_model_misty.rmir:tlt_encode \
    /servicemaker-dev/domain_model_misty.riva:tlt_encode
  5. Then I run riva-deploy:
riva-deploy /servicemaker-dev/domain_model_misty.rmir:tlt_encode /data/models
  6. At this point, I can exit the ServiceMaker container and relaunch Riva. I can see the model being loaded successfully.
I0829 17:16:16.099950 20 http_server.cc:282] Started Metrics Service at 0.0.0.0:8002
I0829 17:16:23.944136    22 model_registry.cc:143] Successfully registered: conformer-en-US-asr-streaming-asr-bls-ensemble for ASR Triton URI: localhost:8001
I0829 17:16:23.996660    22 model_registry.cc:143] Successfully registered: riva-punctuation-en-US for NLP Triton URI: localhost:8001
I0829 17:16:24.008618    22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
I0829 17:16:24.038719    22 model_registry.cc:143] Successfully registered: riva-punctuation-en-US for NLP Triton URI: localhost:8001
I0829 17:16:24.050558    22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
I0829 17:16:24.068651    22 model_registry.cc:143] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS Triton URI: localhost:8001
I0829 17:16:24.138864    22 riva_server.cc:171] Riva Conversational AI Server listening on 0.0.0.0:50051
  7. Then I use the Python client to send the following text classification request:
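import riva.client
# Assumption: the client runs on the same host and uses the default gRPC port (50051) shown in the startup log.
auth = riva.client.Auth(uri="localhost:50051")
nlp_service = riva.client.NLPService(auth)

result = nlp_service.classify_text(
    input_strings=["Do I need an umbrella today?", "Tell me a joke."],
    model_name="riva_intent_default",
    language_code="en-US",
)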
  8. The server crashes with the following logs:
I0829 17:16:44.314632   187 grpc_riva_nlp.cc:52] NLPService.ClassifyText called for riva_intent_default.
Signal (11) received.
 0# 0x0000AAAAE986BCCC in tritonserver
 1# __kernel_rt_sigreturn in linux-vdso.so.1

E0829 17:16:44.801805   187 client_object.cc:116] error: failed to do inference: Socket closed
/opt/riva/bin/start-riva: line 59:    20 Segmentation fault      (core dumped) ${CUSTOM_TRITON_ENV} tritonserver --log-verbose=0 --disable-auto-complete-config $model_repos --cuda-memory-pool-byte-size=0:1000000000
One of the processes has exited unexpectedly. Stopping container.
W0829 17:16:49.018988    22 riva_server.cc:195] Signal: 15
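
For anyone trying to reproduce this, it may help to confirm from the startup log in step 6 which model names were registered for each service before sending a request. Below is a minimal sketch that parses the "Successfully registered" lines. The helper name `registered_models` and the regular expression are my own (hypothetical, not part of Riva), and the line format is only assumed to match the log excerpt above.

```python
import re

# Matches startup lines of the form seen in the log above:
#   "... Successfully registered: <name> for <SERVICE> Triton URI: <host:port>"
# The exact log format is an assumption based on this one excerpt.
REGISTERED = re.compile(
    r"Successfully registered: (?P<name>\S+) for (?P<service>\w+) Triton URI: (?P<uri>\S+)"
)


def registered_models(log_text):
    """Return {service: sorted unique model names} found in a Riva startup log."""
    found = {}
    for line in log_text.splitlines():
        match = REGISTERED.search(line)
        if match:
            found.setdefault(match["service"], set()).add(match["name"])
    return {service: sorted(names) for service, names in found.items()}


# Lines copied from the startup log in this post.
SAMPLE_LOG = """\
I0829 17:16:23.944136    22 model_registry.cc:143] Successfully registered: conformer-en-US-asr-streaming-asr-bls-ensemble for ASR Triton URI: localhost:8001
I0829 17:16:23.996660    22 model_registry.cc:143] Successfully registered: riva-punctuation-en-US for NLP Triton URI: localhost:8001
I0829 17:16:24.008618    22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
I0829 17:16:24.038719    22 model_registry.cc:143] Successfully registered: riva-punctuation-en-US for NLP Triton URI: localhost:8001
I0829 17:16:24.050558    22 model_registry.cc:143] Successfully registered: riva_intent_default for NLP Triton URI: localhost:8001
I0829 17:16:24.068651    22 model_registry.cc:143] Successfully registered: fastpitch_hifigan_ensemble-English-US for TTS Triton URI: localhost:8001
"""

models = registered_models(SAMPLE_LOG)
for service, names in sorted(models.items()):
    print(service, names)
```

This only confirms that the name passed as model_name (riva_intent_default) appears under the NLP service. It says nothing about why Triton segfaults at inference time.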

How do I fix this crash? Alternatively, is there another intent classification model I can use to test basic functionality? Thanks, team, for your support.

Hi @zugaldia ,
Let me try to reproduce this and share an update.

zugaldia August 30, 2024, 9:59am 3

Thank you @AakankshaS , let me know if you need any additional information.

zugaldia September 3, 2024, 5:05pm 4

Any luck reproducing this, @AakankshaS ? Or any hints on how else to test text classification with Riva?