Read First: FAQ / Common Issues / Troubleshooting Guide · explosion/spaCy · Discussion #8226

This is a guide to issues that multiple people have experienced. Some are basic Python environment issues, some are recent bugs (in spaCy or in other libraries) that we're dealing with, and some we're not sure about yet.

Be sure to also check out the troubleshooting section in the docs!

How to ask a good question

If your question isn't answered here, feel free to open a new discussion. Some things to keep in mind to help us help you:

Are you using an old spaCy version?

Be sure to upgrade to the most recent spaCy version to get fixes for older bugs. If you're using v2.x, we're still releasing updates for it, so try those too. If you can't upgrade your version for some reason, let us know why so we understand your situation better.
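If you're not sure which version you have, a quick check like the one below can help, and the output is handy to paste into your question. This is only a minimal sketch: it reads `spacy.__version__` and treats anything below major version 3 as "older", which is our assumption for illustration rather than an official cutoff.

```python
import spacy

# Installed spaCy version as a string, e.g. "3.x.y".
version = spacy.__version__

# Major version number, parsed from the first dotted component.
major = int(version.split(".")[0])

print(f"spaCy version: {version}")
if major < 3:
    # Assumption for this sketch: anything before v3 counts as an older release line.
    print("You're on an older release line; consider upgrading if you can.")
```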

Installing trained pipelines

Jupyter and Google Colab

I installed spaCy in Jupyter but I get the error "no such module"

GPU Issues on Google Colab

GPU Support

I have an AMD card...

We do not currently have a guide for this. CuPy has experimental support for AMD GPUs that you can try. If you've gotten spaCy working with an AMD card, please let us know! You can open a Discussion on the topic.

Training an ML model

Check the Example Projects!

Did you know we have example projects? These are complete examples of using the NER, Text Categorizer, Entity Linking, and other models. They include all necessary training data, so you can check out the code, train a model, and use their configs and conversion scripts as reference for your own models. If you have a question about how everything fits together in practice, check here first!

Preprocessing Text

My retrained model forgot pretrained entities

Can I continue training from the latest epoch of the previous training run?

I'm having trouble with binary classification in textcat

I want to add non-textual features

As of v3.2, nlp() and nlp.pipe() accept Doc objects as input, so you can create a simple Doc, attach your data to it as custom attributes, and then pass that Doc to another pipeline.
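To make that concrete, here is a minimal sketch of the Doc-as-input approach. The extension names (`features`, `is_urgent`), the `urgency_flagger` component, and the priority threshold are all hypothetical and only there for illustration; a real pipeline would use the attached data in its own components. It uses a blank English pipeline so it runs without downloading a trained pipeline, and it creates the Doc with the same pipeline's tokenizer to keep the example short.

```python
import spacy
from spacy.language import Language
from spacy.tokens import Doc

# Hypothetical extension attributes; the names are for illustration only.
if not Doc.has_extension("features"):
    Doc.set_extension("features", default=None)
if not Doc.has_extension("is_urgent"):
    Doc.set_extension("is_urgent", default=False)


@Language.component("urgency_flagger")
def urgency_flagger(doc):
    # Read the non-textual features that were attached before the pipeline ran.
    features = doc._.features or {}
    # Hypothetical rule: a priority of 3 or more marks the Doc as urgent.
    doc._.is_urgent = features.get("priority", 0) >= 3
    return doc


nlp = spacy.blank("en")
nlp.add_pipe("urgency_flagger")

# Build the Doc first, attach the extra data, then run the pipeline on the Doc.
doc = nlp.make_doc("The server is down again.")
doc._.features = {"priority": 3, "source": "email"}
doc = nlp(doc)  # passing a Doc here requires spaCy v3.2 or later

print(doc._.is_urgent)
```

The same pattern should work with nlp.pipe() if you pass it an iterable of Docs that already carry their custom attributes.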

Here are some older workarounds for this:

What are iterations, steps, epochs...? When does training stop?

Hyperparameters

Performance

Incorrect pre-trained model predictions

spaCy is too slow

I'm getting Out of Memory errors

Windows

I get the error Microsoft Visual C++ 14.0 is required

I get the error ImportError: DLL load failed: The specified module could not be found.