The advanced search tools should be more advanced, like being able to exclude results.

  • 10
  • Idea
  • Updated 1 month ago
  • (Edited)
Not only should there be a way to exclude results based upon particular attributes, there should be a way to control whether multiple parts of a query are intersected (treated with logical "and") or united (treated with logical "or"), or for that matter, combinations of these operations as an expression in a formal language.

For example, I would like to be able to search for titles that are rated G but also not animated. Without a plot keyword such as, say, "live-action" in regular use (or applied automatically to title entries that do not specify the nature of the work), there is no way to filter out cartoons, stop-motion animation or computer-generated pictures.
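
To make the request concrete, here is a minimal Python sketch of that filter applied to a few made-up records. The titles, the field names and the `search` helper are all hypothetical; nothing here is an actual IMDb interface.

```python
# Hypothetical records; titles and field names are invented for illustration.
TITLES = [
    {"title": "Sample Live-Action Film", "certificate": "us:g", "genres": {"family", "comedy"}},
    {"title": "Sample Cartoon", "certificate": "us:g", "genres": {"family", "animation"}},
    {"title": "Sample Thriller", "certificate": "us:r", "genres": {"thriller"}},
]


def search(titles, certificate, excluded_genres):
    """Return titles with the given certificate and none of the excluded genres."""
    excluded = set(excluded_genres)
    return [
        t["title"]
        for t in titles
        # The set intersection is empty when no excluded genre is present.
        if t["certificate"] == certificate and not (t["genres"] & excluded)
    ]


print(search(TITLES, "us:g", ["animation"]))
```

The exclusion is just a set difference applied after the usual intersection, which is the one operation the current search form has no way to express.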

Some get-us-up-to-speed information...

The query string of a search results URL is organized into a collection of parameters, some of which accept as values a comma-delimited sequence of movie/person properties. For some parameters (namely those that do not take numeric ranges and are not otherwise special), the commas ultimately function as either the logical operator "and" or the logical operator "or". An example of such a URL is http://www.imdb.com/search/title?genres=biography,history,war&title_type=feature,tv_movie,short: "Most Popular Biography-History-War Feature Films/TV Movies/Short Films". Each result is a title that belongs to at least all three of the genres "biography", "history" and "war" (an intersection ["and"] operation), while also being any of the types "feature film", "TV movie" and "short film" (a union ["or"] operation). In the title of the results page, note that the genres are separated by hyphens whereas the types are separated by forward slashes.
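
As a rough illustration of how such a URL is put together, the following Python sketch rebuilds the example from lists of values. The `build_search_url` helper is a made-up name; only the base URL and the parameter names come from the example above.

```python
from urllib.parse import urlencode

BASE = "http://www.imdb.com/search/title"


def build_search_url(**params):
    """Join each parameter's values with commas and assemble the query string."""
    pairs = [(name, ",".join(values)) for name, values in params.items()]
    # safe="," keeps the commas literal instead of percent-encoding them.
    return BASE + "?" + urlencode(pairs, safe=",")


url = build_search_url(
    genres=["biography", "history", "war"],
    title_type=["feature", "tv_movie", "short"],
)
print(url)
```

Note that nothing in the URL itself says whether a comma means "and" or "or"; that is decided per parameter on the server side.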

Some of the possible parameters in advanced title searches are as follows:
  • certificates, which can have any combination of the values "us:g", "us:pg", "us:pg_13", "us:r" and "us:nc_17" (and likely more) which are ORed together;
  • colors, which can have any combination of the values "color", "black_and_white", "colorized" and "aces" which are ORed together;
  • companies, which can have any combination of the values "fox", "columbia", "dreamworks", "mgm", "paramount", "universal", "disney" and "warner" which are ANDed together;
  • genres, which can have any combination of the values "action", "adventure", "animation", "biography", "comedy", "crime", "documentary", "drama", "family", "fantasy", "film_noir", "game_show", "history", "horror", "music", "musical", "mystery", "news", "reality_tv", "romance", "sci_fi", "sport", "talk_show", "thriller", "war" and "western" which (as stated before) are ANDed together;
  • groups, which can have any combination of the values "top_100", "top_250", "top_1000", "now-playing-us", "oscar_winners", "oscar_best_picture_winners", "oscar_best_director_winners", "oscar_nominees", "emmy_winners", "emmy_nominees", "golden_globe_winners", "golden_globe_nominees", "razzie_winners", "razzie_nominees", "national_film_registry", "bottom_100", "bottom_250" and "bottom_1000" which are ANDed together (so some combinations return nothing, but that is no big deal);
  • keywords, which can have any combination of alphanumeric strings or hyphen-separated alphanumeric strings which are ANDed together;
  • online_availability, which can have any combination of the values "US/today/IMDb/free", "US/today/Amazon/paid", "US/today/Amazon/subs", "US/today/WithoutABox/free" and "US/today/Internet Archive/free" which are ORed together;
  • production_status, which can have any combination of the values "released", "post_production", "filming", "pre_production", "completed", "script", "optioned_property", "announced", "treatment_outline", "pitch", "turnaround", "abandoned", "delayed", "indefinitely_delayed", "active" and "unknown" which are ORed together;
  • role, which can have any combination of person keys (for example, "nm1000000") which are ANDed together (in other words "collaborations and overlaps", one of IMDb's best features, by the way);
  • title_type, which can have any combination of the values "feature", "tv_movie", "tv_series", "tv_episode", "tv_special", "mini_series", "documentary", "game", "short", "video" and "tvshort" which (as stated before) are ORed together.
The above is a partial specification of the system in place. I do not yet have a detailed proposal for what would be even more advanced and improved, because there are lots of contextual nuances (of how to apply set theory) to sort out.
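
The behavior described in the list above can be modeled in a few lines of Python. This is only a sketch of the semantics as I have described them, not of IMDb's implementation: it covers four of the parameters, and the `COMBINE` table, the `matches` function and the sample records are invented for illustration.

```python
# Per-parameter combination rule, as described in the list above.
# Only a few parameters are modeled; the records below are invented.
COMBINE = {
    "genres": all,      # every listed genre must be present ("and")
    "keywords": all,    # "and"
    "title_type": any,  # any one listed type may match ("or")
    "colors": any,      # "or"
}


def matches(record, query):
    """Check one record against a query of {parameter: [values]}."""
    for param, wanted in query.items():
        have = record.get(param, set())
        combine = COMBINE[param]
        if not combine(value in have for value in wanted):
            return False  # the parameters themselves are always intersected
    return True


records = [
    {"name": "A", "genres": {"biography", "history", "war"}, "title_type": {"feature"}},
    {"name": "B", "genres": {"biography", "war"}, "title_type": {"short"}},
    {"name": "C", "genres": {"biography", "history", "war", "drama"}, "title_type": {"tv_series"}},
]
query = {
    "genres": ["biography", "history", "war"],
    "title_type": ["feature", "tv_movie", "short"],
}
print([r["name"] for r in records if matches(r, query)])
```

Record B fails because it lacks "history", and record C fails because its type is not among the listed ones. The point of the sketch is that the rule is hard-wired per parameter; the user gets no say in it, and there is no slot for "not" at all.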

By the way, the techniques presented in the IMDb GS topic "Excluding genres in advanced title search" are outdated or inaccurate. To elaborate on that point, we will note that http://www.imdb.com/search/title?!genres=comedy,music&genres=documentary produces the same results as http://www.imdb.com/search/title?genres=documentary does.
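
For what it is worth, a standard query-string parser treats "!genres" as a parameter in its own right, distinct from "genres". The Python sketch below shows only that parsing step. Whether the site simply ignores the unrecognized name is my assumption, though it would be consistent with the two URLs returning the same results.

```python
from urllib.parse import parse_qs, urlsplit

with_exclusion = "http://www.imdb.com/search/title?!genres=comedy,music&genres=documentary"
without_exclusion = "http://www.imdb.com/search/title?genres=documentary"

params_a = parse_qs(urlsplit(with_exclusion).query)
params_b = parse_qs(urlsplit(without_exclusion).query)

# "!genres" shows up as a separate key; the "genres" key is identical in both.
print(params_a)
print(params_b)
print(params_a["genres"] == params_b["genres"])
```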

Jeorj Euler

  • 4718 Posts
  • 5447 Reply Likes
  • incomplete.

Posted 1 year ago

  • 10

Dan Dassow, Champion

  • 10576 Posts
  • 9533 Reply Likes
Hi Jeorj Euler,

Nicely stated specifications. I would love to have an advanced search with the power you specified.

Jeorj Euler

  • 4718 Posts
  • 5447 Reply Likes
Just to clarify, in case my words were misread, most of what I brought up (as a matter of "specification") explains the existing system, which has been in place for many years now.

Jeorj Euler

  • 4718 Posts
  • 5447 Reply Likes
Oh, all right. Thanks, Dan Dassow. It would appear that there are some other GS topics that go into some of the hypothetical details of using Boolean expressions to control union, intersection and inversion (exclusion). It would be nice if we could at least exclude unwanted results based upon their known properties.

bderoes, Champion

  • 843 Posts
  • 989 Reply Likes

Jeorj Euler

Would you like to point out any of the other ATS ideas that look promising?

Jeorj Euler

  • 4718 Posts
  • 5447 Reply Likes
Ha! I'm not sure what would be promising. I don't really have a refined idea short of suggesting that the company host a publicly accessible SQL site with well-documented parameters, where machines that have compatible SQL clients and Web daemons would basically broker access for people without SQL clients. That is inconvenient in a number of ways, including the fact that only Web, e-mail and chat (and shell) systems could ever viably present advertisements to client software. So, instead, there might as well be a website that accepts queries organized into expressions with parentheses, brackets, braces, operators and operands/parameters having alphanumeric identifiers. It could be a burden on the servers if the domain of allowed expressions (or scripts, in effect) is too broad, as in broad enough to let lengthy, inefficient expressions be executed. As well, things can get complicated when multiple levels of nested parentheses are needed to convey an idea (including submitting a query). I'd like to think of something a bit simpler, whereby human error (and subsequent frustration) will not be as prevalent.
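
To give a feel for the expression idea, here is a small Python sketch of an evaluator for queries written with AND, OR, NOT and parentheses, checked against a single record. The `param=value` syntax, the keyword spelling, the `MAX_TOKENS` cap and the function names are all assumptions of mine, not an existing or proposed IMDb syntax. The cap is one crude way of addressing the server-burden concern.

```python
import re

MAX_TOKENS = 50  # arbitrary cap so overly long expressions are rejected early

TOKEN = re.compile(r"\s*(\(|\)|[^\s()]+)")


def tokenize(text):
    tokens = TOKEN.findall(text)
    if len(tokens) > MAX_TOKENS:
        raise ValueError("expression too long")
    return tokens


def evaluate(text, record):
    """Evaluate e.g. 'certificates=us:g AND NOT genres=animation' on one record."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = peek()
        if token is None:
            raise ValueError("unexpected end of expression")
        pos += 1
        return token

    def parse_or():  # lowest precedence
        result = parse_and()
        while peek() == "OR":
            take()
            right = parse_and()  # always parse, so every token is consumed
            result = result or right
        return result

    def parse_and():
        result = parse_not()
        while peek() == "AND":
            take()
            right = parse_not()
            result = result and right
        return result

    def parse_not():  # NOT, parentheses and param=value atoms
        token = take()
        if token == "NOT":
            return not parse_not()
        if token == "(":
            result = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return result
        if token in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {token!r}")
        param, sep, value = token.partition("=")
        if not sep:
            raise ValueError(f"expected param=value, got {token!r}")
        return value in record.get(param, set())

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result


cartoon = {"certificates": {"us:g"}, "genres": {"family", "animation"}}
live = {"certificates": {"us:g"}, "genres": {"family"}}
print(evaluate("certificates=us:g AND NOT genres=animation", cartoon))
print(evaluate("certificates=us:g AND NOT genres=animation", live))
```

Even in this toy form the nesting problem is visible: one misplaced parenthesis raises an error instead of returning results, which is exactly the kind of frustration a simpler interface would need to avoid.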

Dan Dassow, Champion

  • 10576 Posts
  • 9533 Reply Likes
Providing a publicly accessible SQL server for IMDb data would probably require more server and support resources than IMDb would be willing to expend. As an alternative, IMDb does provide a static subset of its data.

IMDb Datasets
http://www.imdb.com/interfaces/
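
With a local copy of the data, an exclusion filter is straightforward. The Python sketch below assumes a tab-separated file with `titleType`, `primaryTitle` and `genres` columns, comma-separated genre names such as "Animation", and `\N` for missing values. That is how I recall the title basics file being laid out, so check it against the documentation on the page above. The sample rows are invented, and as far as I know the files carry no certificate data, so only the "not animated" half of the original example is covered.

```python
import csv
import io


def non_animated(tsv_file, title_type="movie"):
    """Yield primary titles of the given type whose genres exclude Animation."""
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # A missing genre list is assumed to be written as \N.
        genres = set() if row["genres"] == r"\N" else set(row["genres"].split(","))
        if row["titleType"] == title_type and "Animation" not in genres:
            yield row["primaryTitle"]


# Tiny invented sample standing in for the real file.
SAMPLE = (
    "tconst\ttitleType\tprimaryTitle\tgenres\n"
    "ttSAMPLE1\tmovie\tSample Live-Action Film\tFamily,Comedy\n"
    "ttSAMPLE2\tmovie\tSample Cartoon\tAnimation,Family\n"
    "ttSAMPLE3\tshort\tSample Short\t\\N\n"
)
print(list(non_animated(io.StringIO(SAMPLE))))
```

For the real download, the same function should accept a file opened with something like `gzip.open(path, "rt", encoding="utf-8", newline="")`, again assuming the layout described above.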

Daniel Smith

  • 16 Posts
  • 0 Reply Likes
And how many angels can dance on the head of a pin?

steb

  • 130 Posts
  • 25 Reply Likes
12.5?

Jeorj Euler

  • 4718 Posts
  • 5447 Reply Likes
https://d2r1vs3d9006ap.cloudfront.net/s3_images/1753786/RackMultipart20180926-46876-oi4fwq-fist_bump_it_photo_print.png?1537989666

gromit82, Champion

  • 6694 Posts
  • 7332 Reply Likes
Jeorj: I see that I have already given you a vote in favor of this idea, so I am just posting here again to indicate that I agree that I would like to see the advanced search tools offer options such as AND, OR, and NOT as the user may prefer.

Jeorj Euler

  • 4718 Posts
  • 5447 Reply Likes