Profanity filter for MMO chat


32

We are developing an MMO using Smartfox Server. The target audience is children aged 7 to 12.

The MMO has a global chat option.
Whatever the user types in the text box is displayed next to the user's avatar when they press Enter.

We want to filter offensive content / profanity out of this chat.
We can intercept the chat and read the text. The problem is getting the list of profanities itself.

Our questions are:

  1. Where can we get an exhaustive list of all swear words?
  2. What method has been used in a similar scenario to filter them out?

17
Good luck with the Scunthorpe problem.
Cyclops

7
@yetanothercoder, my point is that filtering is a hard problem. For example, will your game have any events on Saturday? Will players be able to type the word "Saturday" (note the middle four letters) in their chats? (And I don't know why the downvote, either. It's not a bad question, but there may not be a simple answer.)
Cyclops

6
And it gets even more complicated when more languages are involved. For example: Starcraft 2 removes "weniger" from chat, which is simply the German word for "less"...
bummzack

4
Another problem I often ran into when I was young and played filtered MMOs was that they were based on English. So if I was speaking French, some decent French words would be censored because they looked like English swear words, and in any case I could still swear in French as much as I wanted.
Xeon06

2
From what I've seen, the most important thing to making a good filter is having an option to turn it off. If you have no option, and players know they have no choice but to be censored, they WILL circumvent the censor. If you make it easy for them to turn it off, chances are they will cease to circumvent it, and those who do not wish to experience harsh language will not have to deal with the people who are trying to circumvent the filter.
Michael Zehnich

Answers:


46

Don't.

Filters don't work. At least, only filters don't work. Whitelists, blacklists, it doesn't matter. Neither of these will ever prevent kids from harassing each other. The only way to make this work would be to not filter the chat, but to provide large building-blocks for sentences. For example, a kid might select "Do you want to..." and the options for "go to..." and "trade..." would be pulled up. Selecting "go to..." would bring up a list of places in the game.

Disney decided on this method for their MMO "Toontown" after their 14-year-old whitelist test subject decided to "stick [his] long-necked giraffe up [their] fluffy white bunny". Simply put, you cannot blacklist or whitelist enough words to prevent abuse.
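A rough sketch of the building-block idea described above, in Python. Everything here is hypothetical: the phrase tree, the place names, and the function names are made up for illustration and are not taken from Toontown or any real game. The point is only that the server accepts a path through a fixed tree rather than free text.

```python
# Hypothetical phrase tree: each key is a fragment the player can pick,
# each value maps to the fragments allowed next (empty dict = sentence end).
PHRASE_TREE = {
    "Do you want to": {
        "go to": {"the Town Square?": {}, "the Beach?": {}},
        "trade": {"hats?": {}, "coins?": {}},
    },
    "Hello!": {},
}


def next_options(selected):
    """Return the fragments a player may pick after the given selections."""
    node = PHRASE_TREE
    for fragment in selected:
        if fragment not in node:
            raise ValueError(f"fragment not allowed here: {fragment!r}")
        node = node[fragment]
    return sorted(node)


def build_message(selected):
    """Validate a complete selection path and join it into a chat line."""
    if next_options(selected):
        raise ValueError("sentence is not finished yet")
    return " ".join(selected)
```

The client would call something like `next_options` to populate its menus, and the server would run `build_message` on whatever path it receives, so a modified client still cannot inject arbitrary text.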


All that being said, if I were designing a children's MMO, I would actually implement a strict blacklist filter, but only as a second line of defense. The first line of defense should always be moderators and the ability to report abuse. Weight the words on the blacklist, and give every user a secret score for how vulgar they are trying to be.

Chances are, any user who tries to circumvent your filter will trigger it first. The more obvious profanities (as opposed to obscure or outdated ones), or repeated profanity attempts, put them on a watch list for moderators, or some sort of ban list. This way, moderators can focus on users who seem to be trying to harass others instead of wasting their time reading the comments of still-innocent kids.
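A minimal sketch of the weighted-score idea, in Python. The words, weights, threshold, and names (`WEIGHTS`, `WATCH_THRESHOLD`, `record_message`) are all placeholders I made up; a real list and cut-off would need tuning by whoever moderates the game.

```python
import re
from collections import defaultdict

# Placeholder terms and weights; obvious profanity would get a higher
# weight than obscure or outdated terms.
WEIGHTS = {"darn": 1, "heck": 1, "badword": 5}
WATCH_THRESHOLD = 6  # assumed cut-off for putting a user on the watch list

scores = defaultdict(int)  # user id -> accumulated secret score


def record_message(user, text):
    """Add the weights of blacklisted words in text to the user's score.

    Returns True once the user's running total reaches the threshold.
    """
    words = re.findall(r"[a-z']+", text.lower())
    scores[user] += sum(WEIGHTS.get(word, 0) for word in words)
    return scores[user] >= WATCH_THRESHOLD
```

The score is never shown to the player. A `True` result would only queue the user for a moderator to look at, in keeping with moderators being the first line of defense.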


6
+1 just for the Toontown link - I especially like the players' use of covert channels for people to exchange their secret code, so they could bypass the filter.
Cyclops

1
It was a really interesting read I thought I'd dig up and share. If you don't read the rest of my answer, at least read that. =P
dlras2

2
I believe Blizzard uses this technique (secret score of curses count posted to general chat) in World of Warcraft, at least I know they used to.
Nate

2
@Dan Personal experience only. I was auto-banned. (Which was different experience than being banned by a GM) Some douche was verbally assaulting some chicks in my guild, and I went off on him. I was not banned from the game, just from /General for some period of time.
Nate

2
+1 for the first word "Don't." Circumvention is what happens and is why you'll just feel like you've wasted valuable programming resources to create a big steaming pile of meecrob! ;-D
Randolf Richardson

10

In response to people saying to not provide the filter, I would argue that you have to provide a filter, for no other reason than to cover your own butt with respect to the parents of your intended audience. Just make sure it can be disabled by the user. By implementing a profanity filter (albeit an imperfect and totally optional one), you can say that you've done everything expected of you to protect the sensibilities of your younger audience.

By making it possible to disable, you discourage users from trying to circumvent it using clever punctuation or substitution, since people who favor that sort of language will immediately disable the filter on their own computers, and will have long since forgotten that a filter even exists.

With that understanding, don't worry so much about the implementation. It doesn't need to be foolproof (which is good, because it can't be foolproof), but it should be relatively complete and as unintrusive as possible. That is, you want to make sure you don't make the "clbuttic mistake".

The implementation can be extremely simple -- get a word list, and replace any words found in the list with asterisks or something similar. Best to search for whole words only, as well.
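Something along these lines, as a sketch in Python (the two-word list is a placeholder and `censor` is just a name I picked). The `\b` word boundaries are what keep it to whole words only:

```python
import re

BLACKLIST = ["ass", "darn"]  # placeholder list; load a real one from a file

# \b anchors make the pattern match whole words only, so "classic" is
# left alone even though it contains a blacklisted string.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in BLACKLIST) + r")\b",
    re.IGNORECASE,
)


def censor(text):
    """Replace each blacklisted whole word with asterisks of equal length."""
    return PATTERN.sub(lambda match: "*" * len(match.group()), text)
```

With this list, `censor("a classic assassin")` comes back unchanged, while `censor("DARN it")` gives `"**** it"`.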

As for a word list, that's easy: http://www.google.com/search?q=profanity+word+list

Remember, it doesn't have to be all-inclusive, it just has to be representative of a valiant effort on your part to protect the children.


1
+1 would be my approach as well, after researching in detail what you actually need to do for a specific age rating.
Oskar Duveborn

5

I would try to implement a solution allowing for a blacklist and a whitelist, where you could add 'cunt' to the blacklist, and 'scunthorpe' to the whitelist for example.
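As a sketch of how the two lists could interact (Python; the set names and `filter_message` are my own invention, not an existing library): the blacklist matches substrings, and the whitelist names whole words that are exempt.

```python
import re

BLACKLIST = {"cunt"}        # substrings that should be masked
WHITELIST = {"scunthorpe"}  # whole words that are always allowed


def filter_message(text):
    """Mask any word containing a blacklisted substring unless whitelisted."""
    def check(match):
        word = match.group()
        lowered = word.lower()
        if lowered in WHITELIST:
            return word
        if any(bad in lowered for bad in BLACKLIST):
            return "*" * len(word)
        return word

    return re.sub(r"[A-Za-z]+", check, text)
```

Because both lists are plain sets, adding a word at runtime is a single `BLACKLIST.add(...)` or `WHITELIST.add(...)`, which fits the goal of reacting quickly when people complain.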

I don't believe that you could ever implement a failsafe solution, so I'd try to get the most "popular" words in your dictionary, and make it as easy as possible to add new words to the lists.

The reason for this is that languages, especially English, constantly evolve, and something that has been inoffensive for decades could become offensive in the right context.

Try to get the most words possible and go from there, have quick reaction times when people complain and show that this is generally a concern and I doubt you'll have any problems.

It would be a good idea to know exactly what the guidelines are for censorship in the US: MBNL! (me be no lawyer!)


3
The solution to evolving language is to filter by prefanity.
Cyclops

@Cyclops Win! xD
Jonathan Connell

4

As I commented, filtering all offensive words is really hard - but you could turn it around, and use a whitelist of allowed words. Doing a Google search, it seems fairly common for children's games to limit what players can type to a list. For instance, Lego Universe uses a whitelist.

Also see: Whitelisting for game chat. And note that whitelists can be circumvented. There is no guaranteed solution.

Considering that it's for young children, and mis-spelling could be a problem - depending on the client interface, you might consider word auto-completion. As the players start typing letters, offer a list of possible words and let them select the correct one.
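A small sketch of whitelist checking plus auto-completion, in Python. The vocabulary is a tiny placeholder and the function names (`suggest`, `is_allowed`) are made up; a real list would be much larger, which is why the lookup uses a sorted list with `bisect` rather than scanning everything.

```python
from bisect import bisect_left

# Placeholder vocabulary; kept sorted so prefix lookups can use bisect.
ALLOWED_WORDS = sorted(["hello", "help", "hat", "trade", "treasure", "go"])
ALLOWED_SET = set(ALLOWED_WORDS)


def suggest(prefix, limit=5):
    """Return up to `limit` allowed words that start with `prefix`."""
    prefix = prefix.lower()
    start = bisect_left(ALLOWED_WORDS, prefix)
    matches = []
    for word in ALLOWED_WORDS[start:]:
        if not word.startswith(prefix) or len(matches) == limit:
            break
        matches.append(word)
    return matches


def is_allowed(message):
    """True when every word of the message is on the whitelist."""
    words = (word.strip(".,!?") for word in message.lower().split())
    return all(word in ALLOWED_SET for word in words)
```

Here `suggest("he")` returns `["hello", "help"]`. The suggestions can run in the client for responsiveness, but `is_allowed` (or its equivalent) would still need to run on the server before a message is broadcast.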


Good idea, though it would seem strange to me on a game for younger children that may get spelling wrong. It could also hinder their personal development out of the scope of what is available on the whitelist.
Jonathan Connell

@3nixios, I agree it has problems, but so does every possible solution. :) One fix to the spelling problem would be - wait, I should add that to my post. :)
Cyclops

+1: this will be a lot safer but as @3nixios: says it would either hinder development or it will be a very big list and so the execution time gets increased right?
naveen

@yetanothercoder, depending on the client type (I'm assuming html/javascript), you could pre-download a list of valid words and check them in the client. This wouldn't slow down the server (it could theoretically be bypassed by a smart programmer, though). Yes, this is more work - again, there are no easy solutions, sorry. It all depends on how much risk is acceptable.
Cyclops

1
@Cyclops For a kids game this could be an acceptable solution if you consider only kids playing. Unfortunately client-side checking would mean a 'bad-man' could easily say what he liked to the other players.
Jonathan Connell

4

There's an answer from Programmers describing one system for building a profanity filter. He doesn't explain how he actually built it in great detail, but it should be enough to get an idea for implementation.


4

This is a problem best solved by humans and social design rather than code.

Your best source for an exhaustive list is a live human who is present in the game and monitoring the chat stream. Put people in your game and let them be your ultimate filter.

Spend some time looking into Lane Merrifield's ideas and philosophies behind Club Penguin and about providing service. Here are two writeups from his presentation at the Austin GDC in 2008. I saw it and remember being very impressed with his style of solving human problems with humans and not code.

http://gamasutra.com/php-bin/news_index.php?story=20234

http://www.raphkoster.com/2008/09/15/agdc08-lane-merrifield-at-their-service/

Specifically because your game is aimed at kids, it's more than just swear filters you'll need to think about. You'll need to worry about people posing as kids who may or may not have bad motives. You'll need to assure parents that their kids are safe. You'll need to assure kids that they are safe too for that matter.

Another plus for humans is that they will understand context. You don't want some kid saying, "My Mom has breast cancer" and getting kicked.


We certainly do have moderators who could ban potential manipulators. I am more concerned about profanity. It will be a tedious task for moderators when most of the words used in a bad context are repetitive.
naveen

I'd say certainly you can have profanity filters active to detect what you might call the common stuff, and flag it to the moderators. It's not that hard to come up with a "top 100" list of words, then do some quick pattern matching on all strings. Remove all spaces and punctuation first so people don't C_H_E_A_T or M A N I P U L A T E the algorithm. Ultimately though it's humans that will do it right.
Tim Holt
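For what it's worth, the strip-then-match step from the comment above might look like this in Python. The two-word list just reuses the comment's own examples as stand-ins for a real "top 100" list, and the function names are hypothetical.

```python
import re

TOP_WORDS = ["cheat", "manipulate"]  # stand-ins for a "top 100" list


def normalise(text):
    """Lower-case the text and drop everything except the letters a-z."""
    return re.sub(r"[^a-z]", "", text.lower())


def flagged_terms(text):
    """Return the listed terms found in the squashed message."""
    squashed = normalise(text)
    return [term for term in TOP_WORDS if term in squashed]
```

Note that squashing the spaces also creates matches across word boundaries: `flagged_terms("such eating")` reports `["cheat"]`. That is a good reason to treat the result as a flag for moderators, as the comment suggests, rather than as grounds for automatic censoring.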

3

Simple solution to the problem:

  1. Remove all spaces and punctuation from your input.
  2. Blacklist everything in the Urban Dictionary.
  3. Blacklist all homophones, etc.
  4. Blacklist everything that could be used as a euphemism.
  5. Write your software to understand the content, intention and tone of what is left.
  6. Throw away game and go to market with sentient and omniscient creation from step 5.

6
homo phones lolololol
Jonathan Connell

3
This is the end result of the spammers captcha solvers and spam filters: sentient AI that battles for control of Earth: one side trying to sell Viagra and the other trying to protect Humanity. Very Transformers. :-)
Zan Lynx

3

Some MMOs for children simply replace chat with a predefined list of emotes and phrases and don't allow free-form chat at all. Perhaps the game could be designed to accommodate that.

Licensed under cc by-sa 3.0 with attribution required.