Commons:Bots/Requests


Shortcut: COM:BRFA

If you want to run a bot on Commons, you must get permission first. To do so, file a request following the instructions below.

Please read Commons:Bots before making a request for bot permission.

Requests made on this page are automatically transcluded in Commons:Requests and votes for wider comment.

Requests for permission to run a bot


Before making a bot request, please read the new version of the Commons:Bots page. Read Commons:Bots#Information on bots and make sure you have added the required details to the bot's page. A good example can be found here.

When complete, pages listed here should be archived to Commons:Bots/Archive.

Any user may comment on the merits of the request to run a bot. Please give reasons, as that makes it easier for the closing bureaucrat. Read Commons:Bots before commenting.

Operator: DaxServer (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: Convert TIFF files to JPEG files and link both. As requested at Convert Commons:Bots/Work requests § Category:Photographs by Carol M. Highsmith to JPEG. The TIFF files in Category:Photographs by Carol M. Highsmith are [recursively] loaded into the bot and converted to JPEG using Wand, a Python binding for ImageMagick. The Exif metadata is copied over using PyExifTool, a Python binding for ExifTool by Phil Harvey. The metadata groups copied over, as far as I have discovered so far, are: Author, Camera, Composite, ExifIFD, GPS, ICC_Profile, IFD0, IPTC, Location and XMP-crs. The entire metadata can be copied indiscriminately if that is preferred over a selection. The new JPEG file page will have the same wikitext as the TIFF file page, with the addition of an {{{other_versions}}} gallery but without categories such as "Uploaded by xyz user", which are retained on the TIFF file page. The TIFF file page is edited to add a link to the JPEG in the gallery, and all of its other categories are removed and replaced with Category:LC TIF images with categorized JPGs. If a duplicate is found by checksum, the page is skipped and marked for manual verification and linking via the gallery. The OpenCV strategy described at User:Fæ/LOC#Housekeeping is rather out of my reach. The bot is being written using Pywikibot and is intended to run on Toolforge.
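
The checksum-based duplicate check mentioned above could look roughly like the following. This is only a minimal sketch: the request does not say which checksum is used or how the known checksums are gathered, so the SHA-1 algorithm and the function names `sha1_hex` and `is_duplicate` are assumptions for illustration.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the hexadecimal SHA-1 digest of the given file content."""
    return hashlib.sha1(data).hexdigest()


def is_duplicate(candidate: bytes, known_digests: set[str]) -> bool:
    """Report whether the candidate's digest is already in the known set.

    `known_digests` is a hypothetical set of digests of files that
    already exist; how it is populated is outside this sketch.
    """
    return sha1_hex(candidate) in known_digests
```

A page whose converted JPEG yields `is_duplicate(...) == True` would then be skipped and marked for manual verification rather than uploaded.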

Automatic or manually assisted: Automatic

Edit type (e.g. Continuous, daily, one time run): Continuous

Maximum edit rate (e.g. edits per minute): 5

Bot flag requested: (Y/N): Y

Programming language(s): Python (Pywikibot)

-- DaxServer (talk) 15:07, 1 July 2024 (UTC)[reply]

Discussion
I'm not able to understand the issue we are trying to solve. All previews of these gigantic TIFFs load just fine for me (in under 2 seconds). I do not experience much difference compared to JPEGs. --Schlurcher (talk) 14:18, 2 July 2024 (UTC)[reply]
 On hold for the discussion linked -- DaxServer (talk) 08:58, 4 July 2024 (UTC)[reply]
Operator: Taylor 49 (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot: Taylorbot (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)
Bot's tasks for which permission is being sought: move ca 34'000 files from Category:Audio files in Kotava to Category:Kotava pronunciation to make category use consistent with other languages, the uploader has agreed
Automatic or manually assisted: automatic, supervised at the beginning
Edit type: one time run as needed
Maximum edit rate: 12 edits / min
Bot flag requested: YES
Creator of the bot code: myself (the operator)
Programming language(s): ASM+BAS

Taylor 49 (talk) 14:01, 26 June 2024 (UTC)[reply]

Discussion
  • Please make test run. --EugeneZelenko (talk) 14:59, 27 June 2024 (UTC)[reply]
    @User:EugeneZelenko ✓ Done 100 files moved. Taylor 49 (talk) 12:09, 28 June 2024 (UTC)[reply]
    Looks OK for me. However, usual naming for pronunciation files is <language code>-<word>, so looks like another task for your bot. --EugeneZelenko (talk) 15:16, 28 June 2024 (UTC)[reply]
    @User:EugeneZelenko @User:Kotavusik Agree that following the standard naming pattern would be good. My bot recently acquired renaming/moving capability. But moving to another title and moving into another cat AFAIK cannot be done in a single request, so my bot will have to run through the ca 34'000 files twice anyway. Can you give the flag and approve the recategorization now? For the renaming I would like to hear the opinion of the uploader. Plus, many of those files are in use, so renaming them would require subsequent editing of other wikis, including some where I do NOT have a bot flag yet. Either way, I will have to find a way to determine automatically which wikis use the files. But it might be a good idea to add sorting hints/keys when recategorizing, and to do this not only for the moved files but also for those that are already in Category:Kotava pronunciation. Taylor 49 (talk) 21:03, 28 June 2024 (UTC)[reply]
    @User:EugeneZelenko @User:Kotavusik Seems that most files have names consisting of just the bare word, but some are suffixed with (avk), for example File:Pabú (avk).wav. So the new name would be constructed by stripping off (avk) if it is present, and adding Avk-. So far no files begin with Avk-. What to do about the broken "File:JustaxoAudio files in Kotava.wav" and "File:BakesikJen elparolo de vorto en Kotavao.wav"? Taylor 49 (talk) 11:35, 29 June 2024 (UTC)[reply]
    The same bare word may be used in another language with a different pronunciation, so this is the whole point of using a language code prefix (this convention came from Wikipedia and, as far as I can tell, is more widespread) or suffix. --EugeneZelenko (talk) 13:58, 29 June 2024 (UTC)[reply]
    If we put Avk- in front of the Kotava words, then the Kotava audio files already present on Wiktionary would need to be replaced. Kotavusik (talk) 18:39, 29 June 2024 (UTC)[reply]
    @User:Bjh21 @User:EugeneZelenko @User:Kotavusik I definitely support the idea of mass renaming by adding the prefix Avk-. Still, how should I rename 34'000 files if I cannot rename the 2 most broken ones? The renaming in wikis where the files are used will be done by a bot (CommonsDelinker, my bot or another bot); it is NOT a task for Kotavusik. An additional advantage of this mass renaming is that the ling= parameter on eo Wiktionary will not be needed anymore, since the module is able to read the language from filenames that follow one of the two supported standards. Can I get the approval and flag for the recategorization now? The mass renaming needs further discussion, but is sufficiently independent of the recategorization. Taylor 49 (talk) 14:52, 30 June 2024 (UTC)[reply]
    @Taylor 49: The two renaming requests that I declined were submitted under criterion 3 (obvious error) with no further explanation. Criterion 3 covers factual errors in filenames, but neither "JustaxoAudio files in Kotava.wav" nor "File:BakesikJen elparolo de vorto en Kotavao.wav" contains any factual error. However, the files might still be eligible for renaming under other criteria. You seem to have the co-operation of the original uploader, so maybe they could request that the files be renamed. That would allow you to use criterion 1 (original uploader request), which is very simple and doesn't require any consideration of the current filename. You might also be able to use criterion 2 (ambiguous name) or 4 (harmonizing names), but 2 would require case-by-case evaluation and I'm not sure 4 applies to pronunciation files. Whatever you choose, make sure that the bot records the criterion in the edit summary, preferably with an explanation of how it applies. --bjh21 (talk) 12:57, 1 July 2024 (UTC)[reply]
    Are there any obstacles to approving the recategorization task? Should I make a separate request for the mass move task, or can it be approved here at the same time or separately later? Taylor 49 (talk) 14:54, 8 July 2024 (UTC)[reply]
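
For reference, the renaming rule discussed in this thread (strip a trailing "(avk)" if present, then add the prefix "Avk-") might be sketched as below. This is an illustration only, not Taylorbot's actual code (which is stated to be written in ASM+BAS); the function name `proposed_title` is hypothetical, and the sketch assumes titles of the form "File:<word>.<extension>".

```python
import re

# Trailing "(avk)" marker, with any whitespace before or after it.
_AVK_SUFFIX = re.compile(r"\s*\(avk\)\s*$")


def proposed_title(title: str, prefix: str = "Avk-") -> str:
    """Return the title with "(avk)" stripped and the language prefix added."""
    namespace, _, filename = title.rpartition(":")
    stem, dot, extension = filename.rpartition(".")
    if not dot:  # no extension at all: the whole filename is the stem
        stem, extension = filename, ""
    stem = _AVK_SUFFIX.sub("", stem)
    if not stem.startswith(prefix):  # do not prefix an already-prefixed name
        stem = prefix + stem
    new_filename = f"{stem}.{extension}" if extension else stem
    return f"{namespace}:{new_filename}" if namespace else new_filename
```

With these assumptions, `proposed_title("File:Pabú (avk).wav")` gives `"File:Avk-Pabú.wav"`. The two broken filenames mentioned above would still need manual handling, since their intended word cannot be derived mechanically.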

Operator: Emijrp (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought: add a depicts value to images of people when the image is used in the P18 property of the person's Wikidata item.

Automatic or manually assisted: automatic

Edit type (e.g. Continuous, daily, one time run): continuous

Maximum edit rate (e.g. edits per minute): 1 edit/sec

Bot flag requested: (Y/N): no (it already has one)

Programming language(s): python

emijrp (talk) 16:49, 22 June 2024 (UTC)[reply]

Discussion
Sounds good to me. If they are used to illustrate Wikidata entries, they should be sufficient to be marked as "prominent" here. Please tag these kinds of edits from the bot with the Special:Tags tag "BotSDC". This can be added during the editentity API call as an additional parameter. This will allow people to effectively filter these types of edits from their watchlist, if they wish to do so. --Schlurcher (talk)
OK, adding that suggestion too. emijrp (talk) 14:59, 24 June 2024 (UTC)[reply]
Please make another test run. --EugeneZelenko (talk) 14:03, 29 June 2024 (UTC)[reply]
Occasionally people add pictures like this to P18 (a park in this case). They are related to the person but not portraits. I am thinking how to exclude them. emijrp (talk) 08:30, 30 June 2024 (UTC)[reply]
I came up with a solution: adding the depicts statement only if the filename is equal to the person's name (or differs from it only by added symbols/numbers). Examples: [1] [2] [3] [4] [5] [6] [7] [8] [9]. emijrp (talk) 18:52, 9 July 2024 (UTC)[reply]
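
One possible reading of the filename check described above, as a minimal sketch: the exact rule the bot applies is not given in this discussion, so comparing only the letters of the filename and the name is an assumption, and `filename_matches_name` is a hypothetical helper name.

```python
def _letters_only(text: str) -> str:
    """Case-fold the text and keep only its alphabetic characters."""
    return "".join(ch for ch in text.casefold() if ch.isalpha())


def filename_matches_name(filename: str, person_name: str) -> bool:
    """Check that the filename is the person's name plus, at most,
    extra symbols, digits or whitespace."""
    stem = filename.removeprefix("File:").rsplit(".", 1)[0]  # drop namespace and extension
    return _letters_only(stem) == _letters_only(person_name)
```

Under this interpretation a file such as "File:Jane Doe (1990) 2.jpg" would match the name "Jane Doe", while "File:Jane Doe Park.jpg" would not, which is the kind of non-portrait image the check is meant to exclude.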

https://meta.wikimedia.org/wiki/User:AkbarBot

Operator: Akbarali (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Upload files in bulk to Wikimedia Commons
  • Add descriptions, captions and file names

Automatic or manually assisted:

Edit type (e.g. Continuous, daily, one time run): Intermittently

Maximum edit rate (e.g. edits per minute): 8 edits per minute

Bot flag requested: (Y/N): y

Programming language(s): Pywikibot, Python scripts are on PAWS https://hub-paws.wmcloud.org/hub/spawn-pending/Akbarali

Akbarali (talk) 13:57, 11 June 2024 (UTC)[reply]

Discussion

Operator:

Bot's tasks for which permission is being sought: adding pHash checksum (P9310) and Imagehash difference hash (P12563) values to the photos.

Documentation for the hashes
Example images with P9310 and P12563 values

The first targets are photos from Europeana, Estonia, Finland, Sweden and Flickr, but the long-term target is to add image hashes to all Commons photos. So far we have used FinnaUploadBot for Finna images. The reason for the new account is to have a dedicated account and service for the non-Finna-related edits.

Automatic or manually assisted: automatic

Edit type (e.g. Continuous, daily, one time run): first batch jobs, later continuous

Maximum edit rate (e.g. edits per minute):

Bot flag requested: (Y/N): Y

Programming language(s):

Zache (talk) 15:08, 12 April 2024 (UTC)[reply]

Discussion
What is the use of such hashes? --EugeneZelenko (talk) 14:47, 13 April 2024 (UTC)[reply]
One can use them to compare the similarity of pictures by checking how much the identifiers differ to detect duplicates and match photos in different repositories. We have used image hashes to prevent duplicates when uploading files and to prevent the wrong photos from being updated when reuploading photos from Finna with better quality and/or updating metadata. --Zache (talk) 16:31, 13 April 2024 (UTC)[reply]
Such hashes make much more sense as part of Commons database. --EugeneZelenko (talk) 14:26, 14 April 2024 (UTC)[reply]
In SDC they are file metadata, and with SPARQL in particular it would be an easy way to query and share the hashes for external use. I.e. they are part of the metadata for the files. Zache (talk) 14:52, 14 April 2024 (UTC)[reply]
Also, even if the information were added to the Wikimedia Commons database (there are good technical reasons why one would prefer an external service over adding this to MediaWiki core), I would like to note that we are already populating SDC values from the Commons internal database using bots. Most notable in this context are the SHA-1 checksum, MIME type, image width, and image height. (Commons:Structured data/Modeling/Meta) And yes, there would probably be better ways to do this, but currently using bots is the preferred method. --Zache (talk) 06:42, 18 April 2024 (UTC)[reply]
Is there any community discussion that such data shall be generated at large scale? Krd 06:53, 18 April 2024 (UTC)[reply]
I am not aware of any wider discussion. The existing discussions, to my knowledge, are related to Fæ's User:Fæ/Imagehash and village pump discussions 1 and 2. My structured data property proposal in 2021 received no follow-up comments on Wikimedia Commons. Phabricator has some tickets (for example, phab:T121797) related to image hashing.
Also, just for background, I am running ImageHash-Toolforge, which has approximately 25% of Wikimedia Commons bitmap images (jpg, tiff, png) indexed with phash and dhash. I also made a Wikimania lightning talk proposal for it. (Proposals are currently under review.) My current idea was to proceed gradually when adding values to SDC, and my current personal need was to add hashes to European and Estonian photos before the Wikimedia Hackathon, Tallinn, in May so they would be available there. (see my question in Commons_talk:Bots/Requests#Extending_FinnaUploadBot).
However, if you think I should do the village pump discussion or the discussion on the Structured Data talk pages, I am happy to start these. --Zache (talk) 07:49, 18 April 2024 (UTC)[reply]
Please do. Krd 05:48, 21 April 2024 (UTC)[reply]
Now I made a village pump proposal --Zache (talk) 16:44, 17 May 2024 (UTC)[reply]
How do you interpret the discussion, how would you conclude? Krd 13:15, 3 July 2024 (UTC)[reply]
The current status is 3 vs 2, and a good general rule for bot edits is to make only uncontroversial edits. Based on that rule, it is a good idea to skip adding the values by bot and to implement this in some other way. -- Zache (talk) 12:26, 4 July 2024 (UTC)[reply]
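
As background to the similarity comparison described earlier in this discussion ("checking how much the identifiers differ"), a common way to compare two perceptual hashes is the Hamming distance, i.e. the number of bits in which they differ. The sketch below assumes the hashes are stored as equal-length hexadecimal strings; the function names and the `max_distance` threshold of 4 are illustrative assumptions, not values taken from the proposal or from ImageHash-Toolforge.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hexadecimal hashes."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 bit wherever the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def looks_similar(hash_a: str, hash_b: str, max_distance: int = 4) -> bool:
    """Treat two images as likely duplicates if few hash bits differ.

    The default threshold is an arbitrary example value.
    """
    return hamming_distance(hash_a, hash_b) <= max_distance
```

A distance of 0 means identical hashes; small distances suggest the same photo at a different size or quality, which is the use case mentioned above for preventing duplicate uploads.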

Operator: Geertivp (talk · contributions · Statistics · Recent activity · block log · User rights log · uploads · Global account information)

Bot's tasks for which permission is being sought:

  • Add missing SDC depict statements on media files (File namespace)
  • Add missing Wikidata Infobox template to Category pages (Category namespace)

Automatic or manually assisted: Automatically, but monitored

Edit type (e.g. Continuous, daily, one time run): Intermittently

Maximum edit rate (e.g. edits per minute): 8 edits per minute

Bot flag requested: (Y/N): Y

Programming language(s): Pywikibot, Python scripts are on GitHub:

Test runs are here.

Geert Van Pamel (talk) 22:29, 3 January 2024 (UTC)[reply]

Discussion

@Geertivp:  ? --Krd 05:23, 27 June 2024 (UTC)[reply]

The property has been approved and is implemented. I need some more time to adapt my scripts and to run a few example transactions before I can request the approval of my bot script. I will notify when I am done. --Geert Van Pamel 09:31, 27 June 2024 (UTC)[reply]