
Hello all! 

I thought we could take a deep dive into one of our Operators. In this case:

 

The * Wildcard Operator

The purpose of the Wildcard operator is to find mentions that include the root of a word. It’s really useful for finding lots of variations on a word, without having to include them all in your query.

For example, I could write a query that looks like this:

Kiss OR Kisses OR Kissing OR Kissed OR Kisser

But using the Wildcard operator, I can shorten that query down to:

Kiss*

The use of the * here instructs Brandwatch to find any word that begins with the root Kiss.
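To make the comparison concrete, here is a small Python sketch that mimics the two queries on a handful of sample mentions. It is only a local illustration, not how Brandwatch works internally: the helper names (`words_in`, `matches_or_query`, `matches_wildcard`) and the sample mentions are made up for this post, and it assumes matching is case-insensitive and done word by word.

```python
import re

# The five terms from the long OR query, lowercased.
OR_TERMS = {"kiss", "kisses", "kissing", "kissed", "kisser"}

def words_in(mention):
    """Split a mention into lowercase words (a deliberately simple tokeniser)."""
    return re.findall(r"[a-z']+", mention.lower())

def matches_or_query(mention):
    """Mimics: Kiss OR Kisses OR Kissing OR Kissed OR Kisser."""
    return any(word in OR_TERMS for word in words_in(mention))

def matches_wildcard(mention, root="kiss"):
    """Mimics: Kiss* (any word beginning with the root)."""
    return any(word.startswith(root) for word in words_in(mention))

mentions = [
    "She kissed the trophy",
    "Two kisses on the cheek",
    "Such a kissable puppy",   # a variation the OR list does not include
    "A risky move",            # matched by neither query
]

for mention in mentions:
    print(mention, "|", matches_or_query(mention), matches_wildcard(mention))
```

In this sketch the wildcard version also picks up "kissable", a variation nobody thought to add to the OR list.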

 

Placement is everything with the Wildcard!

Where you place the Wildcard operator in a word has a big impact on the output. For example, if I were to move the * towards the beginning of the word Kiss like this:

Ki*

Then this instruction tells Brandwatch to find any word that starts with “Ki”. Kids, Kin, Kim, Kissing, Kilogram, Kindling and many more would be captured by this query.

Similarly, I can restrict the number of words found by moving the Wildcard further towards the end of the word. If I write a query that looks for:

Kisse*

Then Brandwatch will find words like Kissed, Kisser and Kisses - but not Kissing.
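If you want to get a feel for how placement changes the breadth of a query before you run it, a rough sketch like the one below can help. It assumes a trailing wildcard simply means "starts with these letters", ignores case, and uses a hypothetical `expand` helper and a hand-picked word list; real results depend on the data Brandwatch searches.

```python
def expand(pattern, vocabulary):
    """Return the words a trailing-wildcard pattern such as 'Ki*' would cover."""
    prefix = pattern.rstrip("*").lower()
    return [word for word in vocabulary if word.lower().startswith(prefix)]

# A small, hand-picked vocabulary taken from the examples in this post.
vocabulary = ["Kids", "Kin", "Kim", "Kilogram", "Kindling",
              "Kiss", "Kissing", "Kissed", "Kisser", "Kisses"]

# The further right the * sits, the fewer words are captured.
for pattern in ("Ki*", "Kiss*", "Kisse*"):
    print(pattern, "->", expand(pattern, vocabulary))
```

Here Ki* covers all ten words, Kiss* covers five, and Kisse* covers only Kissed, Kisser and Kisses.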

 

Question Mark ? Operator

The question mark operator ? stands in for a single character in a word. It is useful for covering certain variations, for example:

  • Misspelled words: bel??ve OR sep?rate
  • Language differences e.g. English UK vs US: customi?e

It isn’t so useful when letters are mistakenly added to create a misspelled word, such as MacDonalds. In this case, you should add MacDonalds to your query directly.

  • ? has to match something; it can’t match nothing. It’s a character replacement, unlike the * Wildcard, which can match anything, including nothing.
  • You can’t use ? or * at the start of a word because the system has to start somewhere to match a word. With no letter at the start of the word, the system can’t “start” its search as it doesn’t know what it is matching.
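The difference between ? and * can be sketched in Python by translating a pattern into a regular expression. This is an approximation for illustration only: `to_regex` and `matches` are hypothetical helpers, and the sketch assumes ? means exactly one word character, * means zero or more, and a leading ? or * is rejected, as described above.

```python
import re

def to_regex(pattern):
    """Translate a ?/* pattern into a compiled, case-insensitive regex."""
    if pattern[:1] in ("?", "*"):
        raise ValueError("a pattern cannot start with ? or *")
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(r"\w")       # exactly one character
        elif ch == "*":
            parts.append(r"\w*")      # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)

def matches(pattern, word):
    """True if the whole word fits the pattern."""
    return to_regex(pattern).fullmatch(word) is not None

print(matches("customi?e", "customise"))    # True
print(matches("customi?e", "customize"))    # True
print(matches("M?cDonalds", "MacDonalds"))  # True
print(matches("M?cDonalds", "McDonalds"))   # False: ? cannot be nothing
print(matches("Kiss*", "Kiss"))             # True: * can be nothing
print(matches("Kiss?", "Kiss"))             # False
```

The MacDonalds case shows why an added letter needs its own term: a pattern with ? in that position can never match the correctly spelled word as well.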

 

Over to you!

Have you used the Wildcard operator in any of your queries so far?

Will you use it more in future?

Let us know below.

 

➡️ See more posts in the Boolean Explained series here.

Using the wildcard operator on the end of hashtags is another great time saver, and allows you to find unique variations that your audience uses.


That’s a great shout @Whibs - it’s not just limited to ‘standard’ queries, you can use it with Hashtags too!


Another great one could be the ‘replacement’ operator

analy?e

This will help you find variations of spellings across countries e.g. analyse and analyze


The breakthrough comes when you start using multiple wildcards in a single string, like:

#*topic*

Which will collect both #greattopic and #topicisgreat
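For a quick local sanity check of a pattern like this against hashtags you already know about, Python's standard `fnmatch` module uses similar wildcard rules (* matches any run of characters, including none). This is only an analogy for testing ideas, not Brandwatch's own matching, and the hashtag list is invented.

```python
from fnmatch import fnmatchcase

# Invented sample hashtags; replace with ones from your own data.
hashtags = ["#greattopic", "#topicisgreat", "#topic", "#offtopicchat", "#great"]
pattern = "#*topic*"

# Lowercase each tag so the comparison ignores case.
captured = [tag for tag in hashtags if fnmatchcase(tag.lower(), pattern)]
print(captured)
```

Note that a pattern this broad also picks up #offtopicchat in the sketch, so it is worth reviewing what a double wildcard brings in.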

 


How do you use wildcards with multiple words so something like “word word*” but where the last word in the duo might have multiple variations?


You can’t do “word word*” because the exact match operator (the quotation marks) does not work with the wildcard operator *. However, you can use (word NEAR/0f word*), which captures exactly the scenario you asked about. If your first term is itself a multi-word phrase, you would still need the exact match operator for that term, like (“word word” NEAR/0f word*) or (“word-word” NEAR/0f word*).
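As a rough picture of what a query like this is looking for, here is a Python sketch. It assumes that NEAR/0f means the first term is immediately followed by the second, in that order, with no words in between; that reading, the `near0f` helper and the customer/servic example are my own illustration rather than anything confirmed in this thread.

```python
import re

def tokens(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[\w'-]+", text.lower())

def near0f(text, first, second_root):
    """True if `first` is immediately followed by a word starting with `second_root`."""
    words = tokens(text)
    return any(
        a == first and b.startswith(second_root)
        for a, b in zip(words, words[1:])
    )

# Mimics a query shaped like: (customer NEAR/0f servic*)
print(near0f("Great customer services team", "customer", "servic"))
print(near0f("Their customer servicing is slow", "customer", "servic"))
print(near0f("A service every customer loves", "customer", "servic"))
```

Under these assumptions the first two mentions match and the third does not, because the words appear in the opposite order and are not adjacent.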

If this becomes confusing, feel free to share a real-life scenario and we can take a look.


Hi there

Question re: Japanese language - how does the wild card operator treat Japanese text?

For example, the verb “買う” [ka-u] means purchase, but this can be conjugated into a whole mess of tenses (e.g. 買った [past tense]・買って [gerund]・買いました [past-polite]・買えた [past-potential]), all stemming from that one root of 買

So I think I could use 買* to capture all of these tenses with one query. But since there are no spaces in Japanese, how does the operator determine where the conjugation ends and the next, unrelated phoneme begins?

Thank you


Hi @John Chaparro. I did a bit of digging and found a couple of bits of info which may be useful to you:

  • We divide Japanese text into single characters, so NEAR will mean distance in characters (not words).
  • Japanese and Chinese text is tokenised (i.e. split into 'words') with a boundary between each character. So adding spaces between characters makes no difference to the splitting because it would have split there anyway.
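A tiny Python sketch of what the two points above describe, purely as an illustration: `tokenise_cjk` and `tokens_between` are hypothetical helpers, and the sketch assumes every non-space character becomes its own token. How Brandwatch counts NEAR distances exactly is not covered here.

```python
def tokenise_cjk(text):
    """Treat every non-space character as its own token."""
    return [ch for ch in text if not ch.isspace()]

def tokens_between(text, first, second):
    """Count the tokens sitting between the first occurrences of two characters."""
    toks = tokenise_cjk(text)
    return abs(toks.index(second) - toks.index(first)) - 1

# Spaces make no difference: both versions split into the same tokens.
print(tokenise_cjk("本を買った"))
print(tokenise_cjk("本 を 買 っ た"))

# One character (を) sits between 本 and 買.
print(tokens_between("本を買った", "本", "買"))
```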

With regard to your Wildcard question specifically, I have reached out to a colleague and will get back to you as soon as they do.

@Momoha Obara are you able to help? 


 


 

This is a very interesting insight. I assume this also applies to Korean Hangul? But does it also apply to Abjad scripts (Arabic, Persian, Pashto, etc.) and Ge’ez-derivatives (like Amharic, Oromo, Tigrinya, etc.)?

 

We occasionally use word NEAR/X word* combinations in all these languages, and if they are not working as intended, it would be great to hear about best practices for those.


Hello @Ian Ferguson,

Very interesting input, thanks!

Still, we are not sure how to use wildcards with the Japanese language:

  • Should we add quotes? (I don’t think so?)
  • Should we then use it the same way as for languages written in the Roman alphabet? But if so, as @John Chaparro mentioned above, will it work?

Would be great if by any chance you had some feedback from your team!

 

Thanks

 


@JCS  One for: 

 

