Last modified on 5 June 2013, at 12:28

Commons:Categories for discussion/Current requests/2009/06/SVG category names

This discussion of one or several categories is now closed. Please do not make any edits to this archive.

SVG category names

This discussion started on User talk:CommonsDelinker/commands, after a request concerning the subcategories of Category:SVG coats of arms: rename Category:SVG coats of arms - Algeria as Category:SVG coats of arms of Algeria, rename Category:SVG coats of arms - Argentina as Category:SVG coats of arms of Argentina, et cetera.
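The rename pattern in this request can be sketched in a few lines of Python. This is only an illustration of the string transformation: the helper name `rename_with_preposition` is hypothetical, the " - " separator and the preposition "of" are taken from the request above, and nothing here talks to a wiki or to CommonsDelinker.

```python
import re

# Hypothetical pattern: "Category:<prefix> - <place>", as in the request above.
HYPHEN_NAME = re.compile(r"^(?P<prefix>Category:.+?) - (?P<place>.+)$")


def rename_with_preposition(name: str, preposition: str = "of") -> str:
    """Replace the first ' - ' separator with a preposition.

    Names that do not match the hyphenated pattern are returned unchanged.
    """
    match = HYPHEN_NAME.match(name)
    if match is None:
        return name
    return f"{match['prefix']} {preposition} {match['place']}"


print(rename_with_preposition("Category:SVG coats of arms - Algeria"))
```

Other prepositions discussed in this thread ("in", "from") can be passed as the second argument.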

These moves are of no use at all since all the SVG cats were named the same way. So what is the reason? --Cwbm (commons) (talk) 15:49, 3 June 2009 (UTC)

These SVG cats are named the same way, but I don't think this way is a good way. Categories are usually named on the pattern "Category:Tigers in Spain", not "Category:Tigers - Spain". I think the hyphen should be replaced by a preposition like "in", "of" or "from". Teofilo (talk) 16:15, 3 June 2009 (UTC)

Well, naming a cat "SVG image something" is also not a good way. But you did not change that. --Cwbm (commons) (talk) 16:20, 3 June 2009 (UTC)

I don't understand what you mean. What should we use instead of "SVG image something"? What do you suggest? Teofilo (talk) 16:25, 3 June 2009 (UTC)
Note: This page was moved from User talk:CommonsDelinker/commands/requests where it was wrongly placed. CommonsDelinker is a tool to execute category moves, it is not a place for discussion in any way.--Martin H. (talk) 16:30, 3 June 2009 (UTC)
Come on, CommonsDelinker is a place where people perform some tasks, in particular write requests for that robot. The Wikimedia projects are collaborative projects, and people need to talk with each other on the very place where they are working, not 2 or 3 pages away from there. Teofilo (talk) 16:45, 3 June 2009 (UTC)
I tend to agree with Teofilo - a talk page directly associated to the requests page would be good. Ingolfson (talk) 05:19, 4 June 2009 (UTC)

The SVG should be at the end. Like "Flag foobar in SVG format". --Cwbm (commons) (talk) 07:05, 4 June 2009 (UTC)

Agreed, "Flag of xxx, SVG format". The problem is that in many cases the first part of the category name (SVG Coats of arms...) is generated by tens of templates, with the second part passed as a parameter to the template. Just horrible and not manageable by standard bots. --Foroa (talk) 08:46, 4 June 2009 (UTC)
I insist on "in". "Category:Tigers in Spain" sounds better to me than "Category:Tigers, Spain". Is it OK to use the SVG acronym? Should we not write "scalable vector graphics", as we write "United States" and "United Kingdom" instead of US or UK? How about "Category:Flags of Switzerland in scalable vector graphics"? (I am not sure if capitalization is required: Scalable Vector Graphics?) Teofilo (talk) 01:18, 6 June 2009 (UTC)
Lowercase "scalable vector graphics" would seem to refer to any vector graphics format (they're all scalable pretty much by definition). The name of the specific XML-based vector graphics format used on Commons is "Scalable Vector Graphics", or "SVG" for short. Yes, it's kind of confusing, though not really more so than the names of any other common graphics formats.
(For comparison, "GIF" stands for "Graphics Interchange Format", which could describe pretty much any image file format. "Portable Document Format" is even more generic. "Portable Network Graphics" isn't much better either, especially as there's nothing in the format that'd actually involve a network. And "JPEG" isn't even the real name of the file format (which is actually named "JFIF"), but simply an abbreviation of "Joint Photographic Experts Group", the committee which developed the standard.) —Ilmari Karonen (talk) 16:45, 12 July 2009 (UTC)
So we should use either the acronym or the capitalized version. I have removed the lower case version from the proposal list below. Teofilo (talk) 11:19, 13 July 2009 (UTC)

Rocket000 made the following proposal on Category talk:SVG :

I think we should try and make all these sub categories follow a similar naming scheme. Right now we have the following (note the case changes):

  • SVG — <topic>
  • SVG — <Topic>
  • SVG <topic>
  • SVG <Topic>
  • <Topic> (svg)

I may have missed some in the sub-sub-categories. The majority seem like they use "SVG <topic>". Any suggestions? Rocket000 07:38, 6 May 2008 (UTC)

I like the principle of starting from the topic, rather than from "SVG". Instead of parentheses, I would prefer a natural use of English grammar, with prepositions like "in" or "with". I am also unsure if we should use acronyms. So I would like to add the following suggestions:
  • <Topic> in Scalable Vector Graphics
  • <Topic> in SVG
  • <Topic> in SVG format
Note that the principle of starting from the topic and following with the medium is not what is being done in, for example, Category:Cats in art, where we find "category:paintings of cats", "Statues of cats", "graffiti of cats", "drawings of cats" instead of "cats on paintings", "cats as statues", "cats on graffiti", "cats on drawings" (but "cats on stamps" is being used, and all these "cats" categories could be changed in the future in order to implement the principle of starting from the topic). (This would also require changing the text on Help:SVG#Naming_conventions where "SVG...<topic>" is recommended.)
Teofilo (talk) 08:08, 12 July 2009 (UTC)
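As a rough illustration of how the existing schemes listed by Rocket000 could be mapped onto the "<Topic> in SVG" suggestion, here is a minimal Python sketch. The function name `topic_first`, the regular expressions, and the choice of "in SVG" as the target are assumptions made for the example, not an agreed convention; names that fit none of the listed schemes are left untouched.

```python
import re

# Assumed patterns for the schemes listed above:
#   "SVG — <topic>", "SVG <topic>" (prefix forms) and "<Topic> (svg)" (suffix form).
PREFIX = re.compile(r"^SVG(?:\s*[—–-]\s*|\s+)(?P<topic>.+)$")
SUFFIX = re.compile(r"^(?P<topic>.+?)\s*\(svg\)$", re.IGNORECASE)


def topic_first(name: str) -> str:
    """Rewrite a category name into the '<Topic> in SVG' form, if it matches."""
    match = SUFFIX.match(name) or PREFIX.match(name)
    if match is None:
        return name
    topic = match["topic"].strip()
    if not topic:
        return name
    # Capitalize only the first letter, leaving the rest of the topic as written.
    return f"{topic[0].upper()}{topic[1:]} in SVG"


for old in ["SVG — flags", "SVG Flags", "Flags (svg)", "Tigers in Spain"]:
    print(old, "->", topic_first(old))
```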
There would appear to be guidance-in-the-making at Commons:Naming_categories. See below, however. Globbet (talk) 21:49, 12 July 2009 (UTC)

Rationale

Can I take this back a step? What is the point of any 'SVG by topic' categorisation? Is there extensive 'media type by topic' or 'topic by media type' for other formats? A quick search suggests not much of the former, anyway. Surely <topic> and <media type> are orthogonal concepts and categorisation as 'SVG by topic' would actually make it harder to find files on <topic> that just happened to be SVG? See previous brief discussion at Help_talk:SVG#Categorization. Globbet (talk) 21:35, 12 July 2009 (UTC)

I agree that we shouldn't be dividing up topic categories into SVG and non-SVG; however, I still think having an additional category system (that is, by format) can be useful. Because of the inherent difference between vector and raster graphics, it's useful in many cases to browse media that way (especially in cases like maps, heraldry, translatable diagrams, etc. where it makes a huge difference when you're looking to make derivatives). It wouldn't be very useful if we were talking about making categories like "PNG by topic" or "JPEG by topic". Maybe we should change "SVG" to "vector graphics" to help point out that the format itself (which can consist of purely raster images anyway) is not important but its main feature is? BTW, we also have Category:Pdf files, animation-only categories, and audio-only categories, which are essentially accomplishing the same thing. That reminds me, Category:Pictures and images needs a complete makeover. Rocket000 (talk) 20:57, 14 July 2009 (UTC)
Let's restart.
1. The main thematic categories we are looking at are mainly a subset of a parent category with similar images but with no medium format limitation. So the first rule we should agree upon: The category name should be the same as its parent category with a prefix or suffix.
2. The secondary categories, being the category structure of SVG-related categories in Category:SVG, should not change as much, and I think that the top level could remain mostly the same. Anyway, renaming them has no significant repercussions on the other categories.
3. Considering that the category is a specialisation of its parent category, it sounds logical that its name receives a suffix, not a prefix.
4. We need a suffix notation that lets us express a particular file-format subset category. Although we target SVG here, we have similar problems elsewhere with pdf, audio, video, DjVu, B&W, animated gif, tiff, jpeg, ... We need to find a distinguishing notation, such as a suffix like @svg, ^pdf, ~audio, +agif, "DjVu, °jpeg, ++tiff. Before choosing here, it would be nice to check that the special characters are available on keyboards worldwide and that those strings easily return a result when using a search facility. The latter rules out some special characters such as dots and parentheses. How this suffix is glued, formatted and packed is to be discussed below (+parentheses, lower/upper case, syntax).

Comments are welcome in the appropriate subsections. --Foroa (talk) 15:07, 15 July 2009 (UTC)

Discussion on the principle

To simplify finding SVG files, wouldn't it be easier to modify MediaWiki to categorize all SVG files into a category by MIME type? This shouldn't be too complicated, as MediaWiki identifies the MIME type (img_minor_mime). For most applications (with CatScan), one such category can be sufficient. -- User:Docu at 00:51, 16 July 2009 (UTC)

As the mime type is already stored in a table, it would be more efficient to change the interface/CatScan to use it, rather than to duplicate this into the category table. -- User:Docu at 07:11, 26 July 2009 (UTC)
CatScan does not meet the Wikimedia quality and availability standard. But I agree that a simple filter on the category display function that makes it possible to filter out some media types (for example based on MIME type or file extension) would greatly decrease the need for (redundant) parallel categories. --Foroa (talk) 09:35, 26 July 2009 (UTC)
Not sure if MediaWiki meets Wikimedia standards at all times, but in any case, this doesn't preclude us from relying on it. As yours is an abstract argument without any reference, I'm not quite sure where you intend to go.
The file table and automated categories seem more stable and reliable to me than manually maintained categories. -- User:Docu at 18:25, 30 July 2009 (UTC)
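The idea of filtering on the stored MIME type instead of maintaining parallel categories can be illustrated with a small Python sketch. The field names mirror MediaWiki's image table columns (img_name, img_major_mime, img_minor_mime), but the `FileRow` type, the `files_of_type` helper, and the sample rows are invented for the example; this is not how MediaWiki or CatScan actually implements anything.

```python
from typing import List, NamedTuple


class FileRow(NamedTuple):
    """Toy stand-in for a row of the image table (illustrative only)."""
    img_name: str
    img_major_mime: str
    img_minor_mime: str


def files_of_type(rows: List[FileRow], minor_mime: str) -> List[str]:
    """Return the names of all files whose minor MIME type matches."""
    return [row.img_name for row in rows if row.img_minor_mime == minor_mime]


# Invented sample data: the members of one topic category, in any format.
rows = [
    FileRow("Example_flag.svg", "image", "svg+xml"),
    FileRow("Example_flag.png", "image", "png"),
    FileRow("Example_map.svg", "image", "svg+xml"),
]

print(files_of_type(rows, "svg+xml"))
```

With a filter like this applied to a single topic category, no separate "SVG" subcategory would be needed to find its SVG members.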

The new version of CatScan outputs the image type. Currently it can't filter for it though:

try http://tools.wmflabs.org/catscan2/catscan2.php?language=commons&project=wikimedia&ns%5B6%5D=1&ext_image_data=1

You'd have to ask Magnus to add it. A thumbnail output option is currently missing as well. For most applications it works much better than the old one. -- User:Docu at 18:13, 4 August 2009 (UTC)

I don't think CatScan is something that the vast majority of Commons users would want to get into. The interface is far too daunting. Come to think of it, MediaWiki seems to me to be lacking in navigation facilities at present. We are having this discussion precisely because the category system just does not work all that well. An enhanced, built-in search tool that would find categories and then find their intersections would vastly facilitate the processes of categorisation and searching (making this whole discussion redundant). Globbet (talk) 12:05, 7 August 2009 (UTC)

Discussion on the suffix format

Agree with all 4 points. Now let's concentrate on the suffix notion (for any and all formats, not just SVG). Before talking about the syntax, let's get the capitalization and abbreviation issues out of the way. I suggest SVG, PDF, GIF, etc. Always capitalized and abbreviated. The reason being that is what they are best known as. Abbreviations should normally be avoided, but in this case it makes the most sense. And calling it (for example) PDF instead of "Portable Document Format" is perfectly acceptable and not considered informal in any way. Adding "format" at the end is also not necessary (and saying "PDF format" is like saying "PIN number" or "ATM machine"). One could also say that we are referring to the file extension instead of the format (things like JPEG would have to be JPG/JPEG/etc.). "DjVu" is an exception to the all-caps rule.
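The capitalization rule proposed here (abbreviated, all caps, no trailing "format", with "DjVu" as the exception) is simple enough to sketch in Python. The helper name `format_label` and the `EXCEPTIONS` table are hypothetical, and the sketch deliberately does not try to resolve the JPG/JPEG question raised above.

```python
# Labels that keep their own mixed-case spelling (only DjVu is named above).
EXCEPTIONS = {"djvu": "DjVu"}


def format_label(raw: str) -> str:
    """Normalize a file-format label: drop a trailing 'format', then upper-case."""
    label = raw.strip()
    if label.lower().endswith(" format"):
        label = label[: -len(" format")].rstrip()
    return EXCEPTIONS.get(label.lower(), label.upper())


for raw in ["svg", "Pdf format", "gif", "djvu"]:
    print(raw, "->", format_label(raw))
```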

Ok, now for the notation. Personally, I like parentheses. Simple to use and looks the best (IMHO). A negative of this (potentially) is that it's the same format we use for many other things, either as part of names themselves or as disambiguation terms. Both of these uses relate to the subject itself and are not "meta" characteristics like the file format. The same goes for commas. Many names have commas in them (e.g. "city, country"). Using hyphens is another choice I can live with (but not en/em dashes please, too hard to type on most keyboards), e.g. "category - SVG". Of course brackets aren't an option, e.g. "category [SVG]", due to the wiki link syntax. "category {SVG}" and "category <SVG>" are ugly. @svg, ^pdf, ~audio, etc. are good ideas for searching but would be confusing/inappropriate if we used any of them for the category names themselves, e.g. "category @SVG" - no way. Let's see... there's "category; SVG", "category -- SVG", "category (in SVG)", "category in SVG"... what else? Rocket000 (talk) 18:14, 30 July 2009 (UTC)

On second thought, maybe @SVG, ^SVG, etc. would be ok once everyone becomes familiar with it, but, in general, I think people will take the symbols as meaning something else like @SVG = "at SVG", or °jpeg = "degrees jpeg? what?", or "DjVu = "someone forgot to close those quotes". Rocket000 (talk) 18:41, 30 July 2009 (UTC)

In a wider perspective, I don't think we really need a file format extension specification but a wider functional, content-oriented specification, as becomes clear from other discussions (Black and white pictures for example). File extension specifications have other drawbacks too, such as:

  • SVG: what if another scalable vector format, other than SVG, becomes supported here?
  • Ogg: how to distinguish movies from sound? What if mp* becomes supported?
  • gif: how to filter animated gif from normal gif?

I think that a content oriented notation like this could be more generic:

Other cases to be considered: spreadsheets, programs, formulae (Tex sequences), ...

Templates with the same syntax could be made and added to the specific images for later search and automated bot recategorisation (or additional filtering on the category displays). --Foroa (talk) 17:49, 31 July 2009 (UTC)

Good point. We shouldn't use file extensions. We should use labels that highlight the reason for separation; formats themselves aren't really important. Even right now, there are cases where extension-oriented categories include or exclude things we didn't intend them to, but there are certain issues with the ones above. (And now I'm going to contradict myself...) For one thing, there are important differences in file formats at least in terms of reusability and accessibility, e.g. I currently ignore TIFFs and PDFs because they are very inconvenient to me (this includes anything I have to download just to preview, but for others it could mean downloading new software too). I would hate to have a once useful category be swamped with these. There are also some multi-"content" formats. You already mentioned the OGG and GIF ones, but here are some others: PDFs can embed vector graphics and practically anything else. They can be all text or all images. SVGs can include raster graphics. There are some animated SVGs here, but the MediaWiki thumbnail isn't animated (most browsers support them now, but you need to view the file directly to see it move). Who knows what the future holds. And then there's the maintenance side. Someone looking to restore an image may want to browse by TIFFs only. Someone looking to convert PDF books into DjVu may want just PDFs. But these aren't just for us editors since our "readers" are more interactive than an encyclopedia's readers. People usually come here to find an image to use, not just look at.
Basically, I think trying to make such broad rules may be too great a task for us right now. I don't want to lose focus on the SVG thing which desperately needs help. We should keep the bigger picture in mind, but concentrate on SVGs vectors and get that in shape to see how it works and then possibly extend it to other areas. Since this is a media repository, I think it's ok to be a little "meta" minded when it comes to naming categories. Meaning that there's nothing wrong with having a category called "Drawings of X" or "X sound files", in the same way Wikipedia calls some pages "List of X". They wouldn't say "Article of X" in the same way we wouldn't (or shouldn't) say "Files of X" or even "Images of X" because it's self-explanatory, but that doesn't mean that the subject should always take precedence. More importantly, by using natural language their meaning becomes apparent instantly. Who's going to know what "(f:stext)" means without looking it up? I know that's just an example, but I'm having trouble coming up with something similar that works as well as the plain old names we have now. We just have to get them all following the same conventions. Rocket000 (talk) 05:26, 7 August 2009 (UTC)

Closing stale thread: given the many interesting ideas brought up in this discussion, a proposal should be drafted and presented. -- User:Docu at 16:11, 29 November 2009 (UTC)