Wordalizer | Frequently Asked Questions [OBSOLETE]
January 11, 2011 | Wordalizer
This FAQ is now obsolete. Please check out the new version of Wordalizer.
Want to improve your mastery of Wordalizer and become the King of InDesign Word Clouds? Well! This FAQ reveals the obscure features of the script, provides advanced tips and tricks and helps you to understand limitations and specific issues.
Main Features
• I currently use InDesign CS4 (Mac OS X) but I plan to upgrade to CS5 soon. If I purchase Wordalizer PRO today, will my license work on ID CS5?
[PRO VERSION]
Yes. Wordalizer is provided as a single JavaScript file that you just need to put in the Scripts Panel folder. The script is designed to work in both CS4 and CS5 environments.
[Wordalizer Pro tested in InDesign CS5, French UI. Thanks to Stéphane Baril from Adobe France.]
• What does the “Parse Case” option in the Parser panel mean?
[BOTH VERSIONS]
The “Parse Case” box is checked by default. It tells Wordalizer to detect and gather the case variants of the same word during the parsing process. For example, if the text contains several occurrences of installation, Installation, and/or INSTALLATION, the parser records installation as the prototypal form and counts all occurrences under this entry. Occasionally you may need to distinguish case variants (e.g. Bill vs. bill). In that case, uncheck “Parse Case”.
Note that the “Parse Case” option is disabled when you check “Uppercase” in the Characters panel, because all words are then converted into their uppercase form.
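The effect of “Parse Case” can be illustrated with a minimal sketch. This is not Wordalizer's code: the function name `count_words` is hypothetical, and using the lowercase form as the shared entry is an assumption, since the FAQ does not say how the prototypal form is chosen.

```python
from collections import Counter

def count_words(words, parse_case=True):
    """Count words; when parse_case is True, case variants share one entry."""
    if not parse_case:
        # Case variants such as "Bill" and "bill" stay separate.
        return Counter(words)
    counts = Counter()
    for word in words:
        # Assumption: the lowercase form serves as the shared entry.
        counts[word.lower()] += 1
    return counts

sample = ["installation", "Installation", "INSTALLATION", "Bill", "bill"]
merged = count_words(sample)                      # variants gathered
distinct = count_words(sample, parse_case=False)  # variants kept apart
```

With the box checked, the three spellings of installation are counted under one entry. With it unchecked, Bill and bill remain two entries.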
• What is the difference between “Previous word list” and “Active word cloud” in the Source panel?
[PRO VERSION]
When you run Wordalizer PRO repeatedly during the same InDesign session, the script keeps your current preferences in memory, including the word list it has previously built. This allows you to skip the parsing process if you work with the same text source, and to fine-tune the cloud design from the latest settings. In addition, each time a word cloud is generated, Wordalizer permanently saves its own settings in the resulting InDesign document. So, for example, you can back up a cloud on your hard disk, quit InDesign, then work later from that document's settings by reopening the file, which makes it the “Active word cloud.” Therefore, there are circumstances in which Wordalizer can apply either the preferences of the current session or those of the active word cloud. In this case, you can choose the desired word list in the Source panel by checking either “Previous word list” or “Active word cloud”.
When appropriate, Wordalizer also gives you the option to apply the active word cloud settings rather than the session preferences.
• What does the “Allow Embedded Words” option in the Cloud Design panel mean?
[BOTH VERSIONS]
When “Allow Embedded Words” is checked, Wordalizer is allowed to place words inside other words, provided their bounds do not intersect. To achieve this effect, use a heavy solid font and make sure that your lexicon contains low-weighted words.
• Why is the Parser panel disabled when I select “Active word cloud”?
[PRO VERSION]
Since the parser's role is to extract a list of weighted words from a source text, it has nothing to do if the list is already provided. The option labeled “Active word cloud” causes the script to reuse the word list (the terms and their respective weights) associated with the existing word cloud.
Advanced Tips and Tricks
• How do I create a word cloud from a web page?
[BOTH VERSIONS]
First, open the page in your browser. Select and copy (Cmd+C) the text to parse. Open InDesign and run Wordalizer from the Scripts panel. In the Source panel, check the Clipboard radio button. If you are using the PRO version, you can also check the “Edit Weighted List” box in order to manage the words before generating the cloud. Tune the other settings for your purpose and press OK.
• Can I directly provide a “weighted word list” to Wordalizer?
[BOTH VERSIONS]
Yes, both versions offer the ability to supply a short weighted list rather than a text. Note that this hidden feature will not work if the entire list contains more than 1024 characters. The easiest way to provide your data is to create a text frame in InDesign, then to enter the weighted words in the following format:
word1 : weight1 word2 : weight2 word3 : weight3 etc.
where weight[i] is a positive number.
When Wordalizer detects that the source contains a weighted list, it automatically assigns the weights to the corresponding terms and skips the parsing process. Note that this method allows you to include spaces or exotic characters in your entries. Don't forget to select the appropriate source in the Source panel. A weighted word list can also be provided from the clipboard or a plain text file.
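As an illustration of the format, here is a rough sketch of how such a list could be recognized. It is not Wordalizer's actual detection routine. The function name `parse_weighted_list` is hypothetical, and the sketch assumes one `word : weight` entry per line, which the FAQ does not state explicitly.

```python
MAX_LIST_CHARS = 1024  # limit stated in the FAQ

def parse_weighted_list(text):
    """Return [(word, weight), ...], or None if text is not a usable list."""
    if len(text) > MAX_LIST_CHARS:
        return None
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon so the word itself may contain spaces.
        word, sep, weight = line.rpartition(":")
        if not sep or not word.strip():
            return None
        try:
            value = float(weight)
        except ValueError:
            return None
        # Weights must be positive, finite numbers.
        if not (0 < value < float("inf")):
            return None
        entries.append((word.strip(), value))
    return entries or None

cloud = parse_weighted_list("InDesign : 100\nword cloud : 60\nscript : 35")
```

Ordinary running text yields `None` here, which corresponds to the case where the normal parsing process would take over.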
• I appreciate the “Rarest Words” feature, but it takes longer to process and does not provide a sufficient variety of word weights. How can I fix this?
[PRO VERSION]
The rarest words tend to share the same weight because they appear only once or twice in the text. Consequently, the resulting list does not contain a significant gap between the word weights, which both bogs down the placement routine and reduces the ‘contrast’ of the final word cloud. As a workaround, check the “Edit Weighted List” box in the Source panel and consider manually changing a few weights in the list. For example, if all entries are rated 100, randomly change some of them to 60, 70, 80, or 90. The resulting word cloud will surely look better, although it becomes statistically wrong!
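If you prepare your weighted list outside InDesign, the same workaround can be sketched in a few lines. This is only an illustration with a hypothetical helper name, `spread_weights`. It replaces each weight with a random pick from the values suggested above, and keeps 100 as one of the picks.

```python
import random

def spread_weights(entries, choices=(60, 70, 80, 90, 100), seed=None):
    """Replace each weight with a random pick from `choices` to add contrast."""
    rng = random.Random(seed)  # pass a seed to get a repeatable result
    return [(word, rng.choice(choices)) for word, _ in entries]

flat = [("alpha", 100), ("beta", 100), ("gamma", 100), ("delta", 100)]
varied = spread_weights(flat, seed=1)
```

The words and their order are preserved. Only the weights change, so the result is no longer a faithful frequency count.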
• How do I recover the word list of an existing word cloud?
[PRO VERSION]
Open the word cloud document (or activate it), and run Wordalizer. Make sure that “Active Word Cloud” is selected in the Source panel and check the “Edit Weighted List” box. Press OK. You can then select and copy the entire list from the list editor. (Press Cancel to avoid building a new word cloud.)
Limitations
• Will the word cloud be available as editable text (and not vectors)?
[BOTH VERSIONS]
No, it won't. The current version of Wordalizer does not produce editable text frames. We might consider adding a vector-to-text feature in a future release, but it is really not easy to implement since the main algorithm is purely based on vectors!
• Is there a word length limit?
[BOTH VERSIONS]
See below: “I occasionally get an error #55...” in the KNOWN ISSUES section.
• Since Wordalizer TRY doesn't allow changing the source language, what is the default language?
[TRY VERSION]
It is important to distinguish between the user interface language and the parser language(s) available in the Parser panel. For the time being, Wordalizer provides two UI languages (English, French) and supports six Parser languages (English, French, German, Spanish, Portuguese, and Russian). The script's dialog is displayed in English unless you run Wordalizer from a French installation of InDesign. The Parser language (which is locked in the TRY version) matches your InDesign locale if that language is supported; otherwise it defaults to English. For instance, Spanish InDesign users get a Spanish parser although the dialog box is displayed in English, whereas Finnish users get both the interface and the parser locked to English.
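The selection rule can be summarized in a short sketch. The function name `default_languages` and the two-letter language codes are assumptions made for illustration; they are not how Wordalizer identifies locales internally.

```python
UI_LANGUAGES = {"en", "fr"}                            # two UI languages
PARSER_LANGUAGES = {"en", "fr", "de", "es", "pt", "ru"}  # six parser languages

def default_languages(indesign_locale):
    """Return (ui_language, parser_language) for a locale such as 'es' or 'fr_FR'."""
    code = indesign_locale.lower()[:2]
    # Each setting falls back to English independently of the other.
    ui = code if code in UI_LANGUAGES else "en"
    parser = code if code in PARSER_LANGUAGES else "en"
    return ui, parser
```

For example, `default_languages("es")` gives an English dialog with a Spanish parser, while `default_languages("fi")` gives English for both.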
• Why does Wordalizer regard ‘dog’ and ‘dogs’ as two different words?
[BOTH VERSIONS]
The text parser is based only on morphological routines. It scans the characters of the supplied text and counts the occurrences of each extracted word. If the “Skip Usual Words” box is checked, it also removes many blacklisted terms, but it does not implement a stemming procedure that would unite the inflected variants of a word (plurals, verbal forms, etc.). If you use the PRO version, you can manually regroup the words as required by editing the weighted list.
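The manual regrouping amounts to adding the weights of the variants to the entry you want to keep. Here is a minimal sketch of that arithmetic, with a hypothetical helper `regroup` and made-up weights.

```python
def regroup(weights, groups):
    """Merge the weights of listed variants into their chosen headword."""
    merged = dict(weights)
    for headword, variants in groups.items():
        total = merged.pop(headword, 0)
        for variant in variants:
            # Each variant's weight is moved to the headword.
            total += merged.pop(variant, 0)
        merged[headword] = total
    return merged

weights = {"dog": 12, "dogs": 7, "run": 4, "running": 3, "cat": 5}
merged = regroup(weights, {"dog": ["dogs"], "run": ["running"]})
```

In this example, dog ends up with 19 and run with 7, while cat is left untouched.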
• Will Wordalizer join two words as a single entry if I use a non-breaking space between the words?
[BOTH VERSIONS]
With the exception of hyphens, the text parser ignores most of the ‘non-alphabetic’ characters (which depend on the selected language). That is to say, spaces, non-breaking spaces, punctuation, bullets, and any similar characters are regarded as word separators. However, the PRO version allows you to insert extra characters through the word list editor, which always supersedes the parser restrictions.
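The separator rule can be approximated with a regular expression. This is a simplified sketch, not Wordalizer's tokenizer. It assumes that any Unicode letter is ‘alphabetic’ and that only the plain hyphen joins word parts, whereas the real character sets depend on the selected language.

```python
import re

# A word is a run of letters, optionally joined by single internal hyphens.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:-[^\W\d_]+)*")

def tokenize(text):
    """Split text into words; every other character acts as a separator."""
    return WORD_PATTERN.findall(text)

# \u00a0 is a non-breaking space and \u2022 is a bullet.
tokens = tokenize("well-known word\u00a0cloud, e.g. \u2022 bullets!")
```

Here the non-breaking space splits word and cloud into two entries, just as a regular space would, while well-known stays in one piece.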
• Why can't Wordalizer manage more than 100 words?
[BOTH VERSIONS]
This limitation is empirical. It follows from the fact that Wordalizer requires a lot of computing resources. However, the word limit will probably be raised in future releases.
• Is there a way to add my own color schemes in the Theme list?
[BOTH VERSIONS]
Not yet! When it generates a word cloud for the theme you've selected, Wordalizer adds the required colors to the Swatches panel.
Of course, you can control the colors of the word cloud at this level. If you really like your customized theme and want to reuse it, one option is to save the swatches in a .ase file so you can easily restore your set later, then apply Steve Wareham's Swatch Switcher script to future word clouds built with the same theme. (Anyway, I concede that this workaround is not comfortable...)
Known Issues
• I occasionally get an error #55, that the method ‘path’ is not supported, and Wordalizer stops!
[BOTH VERSIONS]
This issue has been reported by several users and we are still working to address it. The bug appears when the word list contains an item of about 25 characters or more. It does not seem to be platform-specific. If you encounter this problem with Wordalizer PRO and need to manage long words, please contact our tech support. We will send you a temporary patch.
• InDesign becomes unstable or crashes when I run Wordalizer repeatedly.
[PRO VERSION]
This issue is specifically experienced on Win X64 platforms. It sounds like a JavaScript memory leak, but I have no solution yet. Wordalizer makes extensive use of InDesign resources, especially in the following situations: many documents open simultaneously, use of the “Rarest words” and/or “Logarithmic Weight” features, a high number of words to design, or use of complex fonts with many path points. If InDesign loses steam during your session, save your documents and restart the application.
• My source text contains some letters with dot above and/or dot below. The font I select supports those characters but they don't appear in the word cloud.
[BOTH VERSIONS]
Wordalizer supports several character sets, which depend on the language you select in the Parser panel. The Latin languages have been adjusted to allow most of the characters used in their typical lexical space. They contain the Unicode blocks referred to as Basic Latin, Latin-1 Supplement, Latin Extended-A, and Latin Extended-B, continuing up to U+02AF. But they do not include additional blocks such as “Latin Extended Additional” (from U+1E00 to U+1EFF), which is where most letters with a dot above or below are encoded. Adding more Unicode ranges has a performance cost, so I'm looking at a solution that optionally extends the character set.
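To check in advance whether a text contains letters outside the supported range, a small test like the following can help. It is a sketch that treats U+02AF as a single upper bound, as quoted above. The helper name `unsupported_letters` is hypothetical, and the real parser may apply further per-language restrictions.

```python
MAX_SUPPORTED = 0x02AF  # upper bound quoted in the FAQ for Latin parsers

def unsupported_letters(text):
    """Return the sorted letters whose code point exceeds MAX_SUPPORTED."""
    return sorted({ch for ch in text if ch.isalpha() and ord(ch) > MAX_SUPPORTED})

# \u1e0d and \u1e6d are d and t with dot below (Latin Extended Additional);
# \u012b, \u0101 and \u00e9 all fall below the bound.
found = unsupported_letters("\u1e0dharma and \u1e6d\u012bk\u0101, caf\u00e9")
```

Only the two dotted letters are reported in this example. The accented letters from Latin-1 Supplement and Latin Extended-A pass the check.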
• For more details about Wordalizer, see the main page:
http://www.indiscripts.com/category/projects/Wordalizer.
Comments
Thanks for this excellent FAQ…
Wow, loads of great tips -- thanks! I'm now back in Oslo and will soon resume debugging our, umm, er, little problem? ;-)
I need a script that can do this, but can deal with a weighted list about 5000 entries long. Is there any progress on allowing more than 100 words or 1024 bytes in a weighted list?
Ouch!!! 5000 weighted items is really too much for Wordalizer. Take a look at Wordle (www.wordle.net) -- maybe Jonathan Feinberg's magic tool could do that...
Is there any way to get Wordalizer to work with Arabic text?
> Is there any way to get Wordalizer to work
> with Arabic text?
Well, it's theoretically possible. I could extend the parser to include Arabic language and deal with right-to-left features. The main problem —for me— is that I don't know how “lexical entities” work in Arabic. What characters behave as word breaks? Do we need to check special diacritics or combined glyphs to extract reliable words? What about “stop words”?...
Hi Marc
Many thanks for your reply. I am happy to help with adjusting your script to work with Arabic. I am not a programmer but have some experience with issues you mentioned. I am happy to do more research to answer your specific questions if that would help.
I found a link online in relation to Arabic Stop Words at:
http://members.unine.ch/jacques.sav...
I will review the list included.
I also found the following paper regarding Arabic Lexical Entities http://www.lrec-conf.org/proceeding... - I will read more about lexical entities in order to be able to contribute more to your question.
I tried Wordle with Arabic and it works very well, the problem with Wordle is the limited Arabic fonts they allow (only one or two). I think your script will provide much more flexibility in Arabic as users can use their own licensed fonts.
I believe Wordalizer in Arabic will be a very useful tool especially with the natural calligraphic nature of the written Arabic language which will appeal to many designers.
Being a native Arabic speaker as well, I am willing to help in developing Wordalizer for Arabic in any capacity I can.
This is my email: kg@khaledgalal.com
Please get in touch if you would like to discuss further.
Hi Mark
Another interesting paper...
http://metadata.sims.berkeley.edu/p...
http://www.globalwordnet.org/AWN/
http://sourceforge.net/projects/awn...
As I mentioned in my earlier post I am not a programmer, so maybe I will need to learn more about your requirements to be able to provide more relevant resources.
Hi Khaled,
Thank you!! I'm studying the links you pointed out. Really exciting stuff, at least on the theoretical side of the challenge.
On the practical side, I'm afraid it is beyond my skills. Wordalizer is based on a pretty basic tokenizer. I don't plan to include a ‘stemmer’ or to implement Wordnet concepts... I think it's too sophisticated for a simple script. All we need is a way to extract lexical entities —I mean: words— from a character stream. Given an Arabic text, what would be the most effective algorithm?
Regards,
Marc
> Take a look at Wordle (www.wordle.net)
While it can handle vast word lists, the primary limitation of wordle is its interaction with Adobe's PDF printer - I simply cannot get the resolution I want out of it, which is 600 dpi for a 36 inch by 48 inch poster.
The command-line tool that IBM supports is almost robust enough to do the job, but not quite. It requires an enormous amount of memory in Java, and needs more than I have (6 GB) to get to 600 dpi - and it fails about 50% of the time at 400 dpi.
The beauty of a system like yours is that it's a 100% vector-based solution, and will easily scale and allow me more customization.
The product I've got developed currently involves a 10,000 long weighted word list, but if I had to limit it to 5,000 as I mentioned earlier, it would still be useful.
Has there been any development on increasing the amount of weighted words? Has stability on Windows 7 improved? Processing time isn't my primary concern.
Dear Marc, I was wondering where can I download the last release of Wordalizer.js
I purchased a license last year, but I believe I don't have the latest version.
Thanks a lot, regards.
Dear Nicolas,
My records indicate that you downloaded the latest release: “Wordalizer Pro 1.25” — April 2010.
Rest assured, all licensees will be automatically notified of the next release. Your personal update link will also be communicated at that time. So you cannot miss it ;-)
Thanks for your patience.
Regards,
Marc
Thanks a lot Marc! I was worried ;)
If this script had the ability to alter the shape of the collage... ie, circle or square so I can produce an image like http://www.shutterstock.com/cat.mht...
then I would definitely pay for it.
Maybe a good option for the pro version?
I am interested in buying Wordalizer, but am in the United States. How much is the product?
@ Robert
The price for one Wordalizer Pro license is EUR 25.00. As stated in the Terms and Conditions of Sale:
“4.2. Currency. — All our digital product prices are specified in Euros (EUR), and all orders placed with us require payment in Euros (EUR). However, when you purchase through a PayPal account, PayPal may perform the conversion from your currency to Euros at the then-current exchange rate. Note that PayPal charges a fee for currency exchanges. For more information about PayPal currency conversion features and fee, please refer your PayPal User Agreement.”
Please, read more at: http://www.indiscripts.com/pages/cg...
Regards,
Marc
is there a way to handle 2 word phrases in Wordalizer? for example artists' names, first and last? in wordle.net you can combine them with a symbol - it's a tilde ~ as I recall and then they scramble together.
if you had that I would buy it in order to use its integration with indesign which is great. thanks.
if not, this would be a good feature enhancement.
thanks.
Does the script work in CS6? I have tried it and I can't seem to get it to work. It gives an error.
Harry
Sorry, Wordalizer is not fully CS6-compliant yet—although it works on some platforms.
Regards,
Marc
Waiting for the cs6 version ; )
I'm unclear about this functionality given your answer far above. What do I need to do to join words to have short phrases in my word cloud: "little brown fox", "happy times", "Love and Chocolate"? And, can I do this in the Try or Pro version? thanks.