Commons:Requests for comment/Technical needs survey

From Wikimedia Commons, the free media repository

This survey is not finalized yet; the method or timeline might change.

Background

Commons is facing many technical problems in the form of bugs, broken tools, and missing features. In September 2022, Commons:WMF support for Commons started working on some of these. A recent discussion on the Village pump showed that we never really decided what we as Commons users need most. This survey should fill this gap and result in a priority list of the most urgent problems.

Much of this was already discussed in the open letter of 2022: Commons:Think big - open letter about Wikimedia Commons.

Method

This survey uses the same method as the annual m:Community Wishlist Survey of the WMF on Meta and de:Wikipedia:Technische Wünsche of Wikimedia Germany.

Timeline

  • Until 24 December 2023: discuss the procedure of this survey and change it if needed (proposals can already be made but might need to be adjusted later)
  • Until 14 January 2024: submit and discuss proposals
  • 15 January 2024 to 21 January 2024: cluster and merge proposals if needed
  • 22 January 2024 to 15 February 2024: vote on the proposals

Resulting list

During the proposal and voting phases all proposals are treated the same, but after the voting there will be two separate lists: one for fixing existing functionality and tools, and one for requested new features. Please consider this when creating proposals: if you want both fixes and new features for one tool, split them into two proposals.

Proposals

Use the box below to create a proposal:

File verification

Description of the Problem

  • Problem description:

Source websites from which content is uploaded to Commons may cease to exist over time. Once that happens, files that originate from them could easily (especially when certain conditions are met) be mistaken for copyright violations. Even when the source website still exists and still has the uploaded file available, mistakes can lead to a file being wrongly deleted (just have a look here). Another problem is vandalism: if the file page was vandalized, the file's source could be missing or could have been changed (yes, file history should be reviewed before deletion, but work overload could lead to it not being reviewed with due care).

  • Proposal type: bugfix / feature request / process request

feature request

  • Proposed solution:

Implement a mechanism to verify uploaded files. When a file uploaded to Commons is patrolled (by a user who has the privileges for it), it could also be publicly marked as verified (this could also be done for already existing files over time). This proposal is similar to what is already being done for images from sites such as Flickr, but extended to all files from external sources. A verified status would be more than a simple verification or attribution template (for example, verification couldn't be removed by a vandal, only by an administrator if needed). Of course, we can never be 100% sure, but once a file is verified, an exhaustive investigation would be required before considering it a copyright violation, so the risk of mistaken removal is greatly reduced. Also, users could use verified files with greater confidence.

  • Phabricator ticket:
  • Further remarks:

If this is not feasible, an intermediate solution could be to prevent unprivileged users from removing attribution templates (but this would only help for files to which an attribution template applies).

Discussion

Does this amount to placing a request for license review on every upload that comes from a third-party site? That seems excessive. Consider especially material old enough to be out of copyright on that basis, or a PD-ineligible logo. Similarly, a U.S. government doc with internal markings that show it to be that; I'm sure there are many other cases. You'd be taking "patroller" (presumably actually image-reviewer) time to verify something that has nothing to do with the source site. - Jmabel ! talk 19:33, 17 December 2023 (UTC)
If the patroller/image reviewer has indeed verified that the image (or other media) has been published under a free license, I think it would be a very good thing that he/she could mark the file as verified, and that this could be visible to anyone. This would even save work in the future: the file is not a copyright violation, so if somebody tags it as such, the deletion request can be quickly dismissed unless significant new evidence has been found (this would happen very rarely, if things are done well). Many files are in fact verified (any reviewed media from third parties that is not found to be a copyvio has been verified, but we can't know which files have been reviewed). As an uploader of many files from Spain's National Geographic Institute, I can say that most of these files include the text "© Instituto Geográfico Nacional. All rights reserved. Total or partial reproduction banned", because they were published before IGN released them under the CC-BY 4.0 license. I'm sure those maps (or at least most of them) were reviewed and everything was found to be OK. But if in the future the URL from which they were downloaded ceases to exist, someone could tag the file for deletion as a copyvio. The administrator who reviews the deletion request would then see that there's an "All rights reserved" text on the image, that it's only a few years old, and that no evidence of it being CC-BY licensed can be found on the source website, because it doesn't exist anymore. I think that allowing a file to be marked as "Verified" would solve this. On the other hand, as I also said, not allowing unprivileged users to remove attribution templates from files would be another way to prevent that kind of thing from happening. MGeog2022 (talk) 19:53, 17 December 2023 (UTC)

Bots

Description of the Problem

  • Problem description:

Some bots don't do what they used to do.

  • Proposal type: bugfix / feature request / process request

feature request

  • Proposed solution:

Provide more support to the bot maintainers, add bot maintainers, or bring the bots into WMF management

  • Phabricator ticket:
  1. None yet.
  2. T339145
  • Further remarks:

List of such bots and undone tasks:

  1. User:SteinsplitterBot: Maintenance of reports like Commons:Database reports/Abuse filter effectiveness, which has not been updated since 00:43, 06 October 2020 (UTC). Updates requested of User:Steinsplitter 11:42, 25 September 2022 (UTC) in an ignored post archived to User talk:Steinsplitter/Archive/2022#Commons:Database reports/Abuse filter effectiveness.
  2. Commons deletion notification bot, which notifies talk pages on other WMF wikis about images that are up for deletion on Commons, has been broken since 2023-06-06. See T339145. Toohool (talk) 19:13, 10 December 2023 (UTC)
    @Toohool: MusikAnimal (WMF) started that task. It has needed discussion since Jun 21 2023, 10:07 AM. This is what can happen under WMF management.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 19:23, 10 December 2023 (UTC)

Drafted by   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 01:05, 10 December 2023 (UTC)

Discussion

File upload stability

Description of the Problem

  • Problem description: When uploading files using the UploadWizard or the API, users very frequently experience problems that result in aborted uploads or broken files. When an error goes unnoticed, the result may be broken files on Commons or lost file description page information. When errors are noticed, they are very inconvenient for uploaders, which drives away long-term contributors and scares off new ones.
  • Proposal type: bugfix
  • Proposed solution: Set the goal that at most 1 in 10,000 uploads using the API fails because of server-side problems, and that at most 1 in 1,000 uploads in the web browser fails because of server or website errors.
  • Further remarks: Feel free to add other relevant tickets. GPSLeo (talk) 14:05, 9 December 2023 (UTC)
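
To make the proposed targets concrete, here is a minimal sketch of how a measured failure rate could be compared against them. The two thresholds come from the proposal above; the function names and the idea of counting failed versus total uploads are assumptions for illustration only, not an existing Commons or WMF metric.

```python
from fractions import Fraction

# Targets taken from the proposal text above.
API_TARGET = Fraction(1, 10_000)      # at most 1 in 10,000 API uploads may fail
BROWSER_TARGET = Fraction(1, 1_000)   # at most 1 in 1,000 browser uploads may fail


def failure_rate(failed: int, total: int) -> Fraction:
    """Return the exact share of uploads that failed (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= failed <= total:
        raise ValueError("failed must be between 0 and total")
    return Fraction(failed, total)


def meets_target(failed: int, total: int, target: Fraction) -> bool:
    """True if the observed failure rate does not exceed the target."""
    return failure_rate(failed, total) <= target
```

For example, 2 server-side failures in 10,000 uploads would miss the API target but still meet the browser target. Exact fractions are used so that a rate sitting exactly on the threshold is not misjudged because of floating-point rounding.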

Discussion

  • I wholeheartedly support this proposal. The UploadWizard in particular could present server error messages much more understandably and handle many of them better. I would also like to point once more to the Android tool Offroader, which shows how stable uploads to Commons can be with the existing server implementation, even under the most adverse conditions: an interrupted upload can easily be resumed, even on a different device and over a different internet connection; uploads can be verified as error-free; duplicates can be detected and prevented before an upload begins; and, as a development aid, the server messages during an upload can be recorded for a post-mortem. --C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 18:41, 9 December 2023 (UTC)
Translated from German; original translation: Google Translate via   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 00:27, 10 December 2023 (UTC)
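
The comment above mentions verifying that an upload arrived without errors and detecting duplicates before uploading. One common way to do both is to compare SHA-1 hashes; the following is only a sketch of that general idea and makes no claim about how Offroader or the UploadWizard actually work. The function names are hypothetical. The query parameters assume the MediaWiki Action API's `list=allimages` module with its `aisha1` filter; the server-side hash for the intact check could be obtained, for instance, via `prop=imageinfo` with `iiprop=sha1`.

```python
import hashlib


def sha1_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-1 hex digest of a local file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_is_intact(local_sha1: str, server_sha1: str) -> bool:
    """Compare the local hash with the hash the server reports for the file."""
    return local_sha1.strip().lower() == server_sha1.strip().lower()


def duplicate_query_params(sha1_hex: str) -> dict:
    """Build query parameters asking the wiki for files with this SHA-1.

    Assumes the Action API's list=allimages module and its aisha1 filter;
    sending the request is left to the caller.
    """
    return {
        "action": "query",
        "list": "allimages",
        "aisha1": sha1_hex,
        "format": "json",
    }
```

A client could run the duplicate query before starting an upload and the intact check after it finishes, and log both results for later debugging.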

Taking on certain upload tools

Description of the Problem

  • Problem description: Certain tools, many of which are not part of MediaWiki itself, are nonetheless very basic for people who upload files to Commons. Many of these are currently each maintained by a single individual. We need a plan for more robust maintenance of these over time.
  • Proposal type: process request
  • Proposed solution: a program manager at WMF should be responsible for a plan for maintenance (or replacement) of these tools going forward. I (Jmabel) am not trying to dictate a particular technical solution here, just to have some entity that is not "the community" take primary responsibility. If this is best done by a paid team at WMF, great. If this is best done by a better-organized and "deeper" pool of volunteers, great. And some might best be left to exactly whoever is doing them now, but if that is a single individual we need at least a plan as to what should happen if that individual becomes unavailable. If it's some mix of the above, or even third parties like the Flickr Foundation, great. And if individuals want to contribute on their own, and the community can adopt their tools or not, that's also great. But I think we need program management from within WMF so that someone has the job of making overall status visible and making sure the ball doesn't get dropped.

Initially, we need to identify what tools would have this status. People are welcome to add to this initial list (and/or clarify situations), but please stick to existing (or previously existing and now broken) tools used by contributors who upload content.

  1. Special:UploadWizard: as I understand it, this is part of mediawiki, and is already maintained by WMF staff
  2. Special:Upload: as I understand it, this is part of mediawiki, and is already maintained by WMF staff
  3. Uploading apps for mobile devices (I know nothing here, I never use them, can someone please fill this in?)
  4. Flickr2Commons: the Flickr Foundation has already taken on the task of replacing this with a more robust tool, which I think means this is well covered
  5. Batch uploader(s) (programs running on a PC): there have been several of these over the years, notably Commonist, which I believe is dead. I have no idea of the current status here
    1. Pattypan: for batch upload via spreadsheets, some issues but working, developed by Yarl and maintained by Abbe98
    2. Vicuna Uploader
  6. tool(s) for mass uploads from GLAMs or other databases of file content: I have no idea of the status of these
  7. Video2Commons: especially important because of its ability to convert file formats. This is often broken to one degree or another.
  8. CropTool: (rotating and cropping, either for overwrite or for a new file). Currently in danger of breaking because the Grid Engine is about to go away and no one has dealt with this.
  9. Url2Commons: for direct upload from the given URL: written by Magnus Manske but not actively maintained (many unresolved issues)
  10. Commons:derivativeFX, tool at https://iw.toolforge.org/derivative: to easily upload derivative works
  • Phabricator ticket:
  • Further remarks: I'm very open to "sympathetic edits" to the above proposal, but reserve the right to revert edits that I think hijack my proposal to be something else. - Jmabel ! talk 22:31, 6 December 2023 (UTC)
    • I have added some tools. — Draceane talkcontrib. 09:31, 7 December 2023 (UTC)
    • I added one too.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 00:39, 10 December 2023 (UTC)
    • In thinking about uploads, it is worth considering various (overlapping, variously combined) groups of users. Some of the considerations include:
      1. Experienced or not
      2. PC vs. tablet vs. phone
      3. Uploading own photos vs. GLAM content vs. other third party
      4. Uploading photos where many photos share a description etc., vs. each being unique
Jmabel ! talk 19:13, 17 December 2023 (UTC)
@Jmabel: With the ideal tool, everything entered by the user should be sharable in the upload session: all or part of the description, source, author, templates, cats, freeform stuff after the description, freeform stuff before the cats... This could follow the model of the granularity of global preferences vs. local preferences.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 22:25, 17 December 2023 (UTC)
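
The "global vs. local preferences" model suggested above could look roughly like the following sketch: values entered once for the upload session act as defaults, and any field set on an individual file overrides them. The field names (`description`, `source`, `author`, `categories`) and the function name are hypothetical and do not describe any existing upload tool.

```python
def effective_metadata(session_defaults: dict, file_overrides: dict) -> dict:
    """Merge session-wide values with per-file values.

    A per-file value wins over the session default; a per-file value of
    None means "not set for this file", so the session default is kept.
    """
    merged = dict(session_defaults)
    for field, value in file_overrides.items():
        if value is not None:
            merged[field] = value
    return merged


# Hypothetical session: most fields are shared, one file has its own description.
session = {
    "description": "Street scene",
    "source": "Own work",
    "author": "Example user",
    "categories": ["Streets"],
}
photo = {"description": "Street scene at night", "author": None}
```

Calling `effective_metadata(session, photo)` keeps the shared source, author and categories and replaces only the description for that one file.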

Discussion

@Jmabel: (or anyone else). Is there some reason why these things are done through third party solutions instead of just being integrated into the website to begin with? Like is there a reason it's better to have the WMF maintain the CropTool instead of them just making cropping an actual feature of mediawiki? --Adamant1 (talk) 11:15, 7 December 2023 (UTC)

If these tools are also available via the API, then there is no reason not to make them a feature of MediaWiki. But batch uploading via a GUI-only tool is no fun. C.Suthorn (@Life_is@no-pony.farm - p7.ee/p) (talk) 16:33, 7 December 2023 (UTC)
As I say, I'm not prejudging the technical solution here. Obviously, if something can be brought into mediawiki and provide essentially the existing capability, that's great, and also benefits other sites using mediawiki. What I am saying is that for Commons, all of the above constitute part of the core functionality that we provide to uploaders, and that this deserves the same level of program management and, ultimately, robustness as the content editing that is core functionality across the sister projects. - 18:46, 7 December 2023 (UTC)
Thanks for the clarification. I'm certainly not against the proposal. I was just wondering about the trade offs between having them manage the applications in house versus just building similar features into mediawiki. I guess they aren't mutually exclusive though. --Adamant1 (talk) 13:27, 9 December 2023 (UTC)
@Adamant1 Working tools the WMF deems useful for all MediaWiki installations are in Core. Working tools the WMF deems useful for some MediaWiki installations are in Extensions. Working tools the WMF deems useful for all WMF MediaWiki installations are in WMF Builds. Working tools developed by others who saw a need and filled it could be upgraded to any of the above. As far as I know.   — 🇺🇦Jeff G. please ping or talk to me🇺🇦 00:49, 10 December 2023 (UTC)