throwup238 8 hours ago

> 'arm/mil' --> this class detects certain types of armored vehicles (very unreliable for now, don't use it yet)

Living near a bunch of military bases, this is what I really need. My suburban defense system keeps mistaking USPS trucks for APCs.

I haven’t received any mail for months.

Sidenote: what are the export restrictions?

  • stephanst 6 hours ago

    That class never really worked and has been removed from the new version of WALDO, FYI. It’s not a military thing and shouldn’t be used as such.

  • AtlasBarfed 5 hours ago

    AI is going to supercharge off-grid antigov nuts and libertarians?

    • PTOB 4 hours ago

      Not just them. I predict it will extend to all classes of folks who use popular schemas for naming taken from self-isolating social forums.

Arubis 8 hours ago

Giving credit where it's due, this project is almost worth doing just to be able to use that fabulously-well-fit initialism.

  • stephanst 6 hours ago

    this was >50% of the motivation

  • lagniappe 7 hours ago

    I'm pretty impressed. I miss the days of cool codenames and initialisms.

adontz 8 hours ago

I wonder if these achievements are related to the war in Ukraine. Do scientists suddenly receive more funding or something? Or does it just happen?

Is there a non-public version with a very reliable arm/mil class? Is there a version which can reliably distinguish T-80s with and without a Z?

  • wongarsu 5 hours ago

    A big part is that training image detection models is incredibly easy today. YOLO is a great network with reasonably intuitive tooling. Anyone with a set of images can start labeling them, copy-paste a couple of lines into a Jupyter notebook and make a decent YOLO finetune.

    The difficulty is in the training data, both acquiring it and labeling it. Hence why the readme of WALDO alludes so much to their semi-synthetic data. That's also why this commercial project is happy to give out the models, but doesn't publish their data pipeline.

    If you have about 100 satellite images each of T-80s with and without Zs, and a couple other satellite images of other tanks and of landscapes without any tanks you can train a T-80 detecting model in a couple hours. And then spend a couple days in a rabbit hole where you figure out that because in your training set only images with tanks had smoke clouds the model now thinks that smoke clouds are linked to tanks, and you end up making larger and larger data sets with tanks and non-tanks from all angles.
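
    For a sense of what "labeling them" produces: YOLO-style tooling generally expects one text line per box, in the form `class x_center y_center width height`, with coordinates normalized to the image size. A minimal sketch of that conversion follows. The function name and the use of class id 0 for "tank" are assumptions made for illustration, not anything taken from WALDO.

```python
def to_yolo_label(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a YOLO label line.

    The label format is: class x_center y_center width height,
    with all four coordinates normalized to the range [0, 1].
    """
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: class 0 stands in for "tank" in a 1000x800 image.
print(to_yolo_label(0, (100, 200, 300, 400), (1000, 800)))
```

    Images with no tanks at all (landscapes, smoke clouds on their own) are the negatives that help with the smoke-cloud problem described above.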

    • mapt 2 hours ago

      Commercial satellite images? With somewhere between 30cm and 100cm resolution? Looking for the letter 'Z' painted on a sidewall of the vehicle?

      Rough.

      Medium altitude aerial drone imagery would do it, though - just a matter of building something so cheap & plentiful that it's not worthwhile to shoot down.

      Who knows, maybe we've given Ukraine the keys to the castle and they're getting a steady stream of 10cm imagery from the NRO.
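
      The arithmetic behind "rough" can be sketched quickly: the number of pixels a marking spans is its size divided by the ground resolution. The 1 m marking size below is purely an assumption for illustration; the resolutions are the ones mentioned above.

```python
def pixels_across(feature_size_m, resolution_m_per_px):
    """Number of pixels a feature of the given size spans at a given resolution."""
    return feature_size_m / resolution_m_per_px


ASSUMED_MARKING_SIZE_M = 1.0  # hypothetical size of the painted letter

for resolution_cm in (100, 30, 10):
    px = pixels_across(ASSUMED_MARKING_SIZE_M, resolution_cm / 100)
    print(f"{resolution_cm} cm/px: about {px:.1f} px across")
```

      Under that assumption the letter spans roughly 1 to 3 pixels at commercial resolutions and about 10 pixels at 10cm, before even considering that it is painted on a side rather than the roof.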

  • rasz 3 hours ago

    I've seen Ukrainians experimenting with YOLO and it was terrible. Every second bush/tree was flipping between person/tree/rock/nothing. Looks like the model was trained on clean urban environment videos.

stephanst 7 hours ago

Hey, thanks for posting. New release is coming tomorrow on HF BTW. AMA

  • patches11 6 hours ago

    Cool project, any specific reason you went with YOLOv7?

    I know you aren't going to release the dataset, but I'd be interested in any info you are willing to share on the augmentations you used, how you generated the synthetic imagery, and what sort of lift you got out of it.

    • stephanst 6 hours ago

      Some of the design choices of YOLOv7 make more sense to me, both in the default augmentations and in the structure of the very large versions of the networks. I find I can push it to marginally better recall. It’s slower than Ultralytics’ V8, but if you want to do stuff like offline processing of satellite imagery, for instance, or get 1fps on occupancy of a parking lot, that kind of performance really doesn’t matter.

throwawaymaths 5 hours ago

I worked for a place where we needed to know with precision where in space a large object was relative to a large area we had full control over. I wonder if this could be used in reverse by, say, dropping QR codes on the ground, using the algorithm to track their relative positions, and doing the reverse operation from there.

ulnarkressty 8 hours ago

What would be legitimate civilian uses for this technology apart from [0]? After the 10k drone swarm the other day and the pager attacks all I can think of is slaughterbots, which is genuinely freaking me out.

[0] - https://xkcd.com/2128/

  • wongarsu 5 hours ago

    Buy satellite images of walmart parking lots, run this model to count the cars. Repeat this every week, buy walmart stock when the number goes up and short walmart when the number goes down.

    Buy satellite images of container ports, count the number of containers, predict performance of economy based on containers and invest accordingly.

    Presidential candidate has an open-air rally and you want to figure out how many people are attending? Buy a satellite image scheduled for that exact hour and let WALDO count the people.

    Financing a number of large construction projects but don't trust the progress reports? Buy regularly scheduled satellite images and let WALDO count the number of trucks and construction vehicles.

    Want to invest in the construction business? Guess what: buy satellite images, count trucks and construction vehicles, and make investment decisions based on that.
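
    All of these reduce to the same step: filter the detections for one class and count them per image, then compare over time. A minimal sketch, assuming each detection is a (class_id, box, confidence) tuple and that class id 0 stands for cars. Both the tuple layout and the class id are assumptions for illustration, not WALDO's actual output format or class numbering.

```python
CAR_CLASS_ID = 0  # hypothetical class id, check the model's real class list


def count_class(detections, class_id, min_confidence=0.5):
    """Count detections of one class at or above a confidence threshold."""
    return sum(
        1
        for cls, _box, conf in detections
        if cls == class_id and conf >= min_confidence
    )


def week_over_week(counts):
    """Relative change between consecutive counts; None where the previous count is 0."""
    return [
        (curr - prev) / prev if prev else None
        for prev, curr in zip(counts, counts[1:])
    ]


# Made-up detections for two weekly images of the same parking lot.
week1 = [(0, (10, 10, 20, 20), 0.9), (0, (30, 10, 40, 20), 0.8), (1, (5, 5, 9, 9), 0.7)]
week2 = [(0, (10, 10, 20, 20), 0.9), (0, (30, 10, 40, 20), 0.4), (0, (50, 10, 60, 20), 0.6),
         (0, (70, 10, 80, 20), 0.95)]

counts = [count_class(week, CAR_CLASS_ID) for week in (week1, week2)]
print(counts, week_over_week(counts))
```

    The confidence threshold matters more than it looks: a count that moves because the threshold or the lighting changed is not a signal about the parking lot.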

    • walrus01 3 hours ago

      There are multiple commercial data feed providers for AIS from pretty much every sizable cargo ship in the world (that isn't operating in some weird grey market economy like the Russian sanctions-evading tankers), which are already used to correlate aerial and SAR data with the self-reported AIS positions of vessels.

  • throwup238 7 hours ago

    > The basic model shared here, which is the only one published as FOSS at the moment, is capable of detecting these classes of items in overhead images ranging in altitude from about 30 feet to satellite imagery with a resolution of 50cm per pixel or better.

    It's not just for drones, it's for any overhead imaging.

    This can be used for all kinds of things like search and rescue, traffic monitoring, watching for wildfires, disaster response, monitoring parking lots as an economic indicator, etc.

  • Nadya 6 hours ago

    Depending on how high it can reliably work from, collaborate with UK CCTV surveillance so that you can better track individuals with fewer cameras, as long as you can collate them with cameras that confirm their position at various points in time.

    Fly a handful of drones over the area of a fleeing suspect and be able to track their whereabouts and look for suspicious behaviors (e.g. someone running and making constant turns in a city or doubling back often, cutting through alleys).

    Hell, fly a few drones over the city to monitor foot traffic of the population and determine possible points of interest for new developments. Where are people walking to? How do they tend to get there? Can we optimize traffic for them - or more realistically - around them?

    Could be used for other forms of crowd analysis too such as how to best disperse a riot and separate a crowd.

    Sorry I guess I'm about as pessimistic as you are about it. Use in S&R like throwup238 suggested seems like a good non-militaristic fit for it.

    Oh and also this which was posted on HN not too long ago: https://dropofahat.zone/

Jerrrrrrry 8 hours ago

This is a perfect module for an ideal proactive security system, thank you.

Grosvenor 7 hours ago

Cool. But doesn't include the training data. So can't reproduce it. :-(

swayvil 8 hours ago

So it gives us, for all the objects in view, a unique id, location, location history, various alerts.

What else? Any thing-description?

If an object leaves the view and re-enters, does it get the same id?

  • syassami 6 hours ago

    It's just yolo-esque classid, bbox coords, confidence. You'll have to implement some sort of tracking algorithm to get your other traits.
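
    A minimal sketch of what "some sort of tracking algorithm" could look like: a greedy tracker that matches each new box to the previous frame's box with the highest overlap (IoU) and hands out a fresh id otherwise. The class name, the 0.3 threshold and the (x1, y1, x2, y2) box layout are assumptions for illustration. This is a generic technique, not something WALDO ships.

```python
from itertools import count


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


class IouTracker:
    """Greedy frame-to-frame tracker; tracks missing from a frame are dropped."""

    def __init__(self, min_iou=0.3):
        self.min_iou = min_iou
        self.tracks = {}       # track id -> box from the previous frame
        self._ids = count(1)   # source of fresh track ids

    def update(self, boxes):
        """Assign a track id to every box in the current frame."""
        assigned = {}
        unmatched = dict(self.tracks)
        for box in boxes:
            best_id, best_score = None, self.min_iou
            for track_id, previous in unmatched.items():
                score = iou(box, previous)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                best_id = next(self._ids)   # no good overlap: start a new track
            else:
                del unmatched[best_id]      # each old track is matched at most once
            assigned[best_id] = box
        self.tracks = assigned
        return assigned


tracker = IouTracker()
frame1 = tracker.update([(0, 0, 10, 10), (50, 50, 60, 60)])
frame2 = tracker.update([(51, 50, 61, 60), (1, 0, 11, 10)])
print(frame1, frame2)
```

    In this sketch an object that leaves the view and re-enters gets a new id, because unmatched tracks are dropped immediately. Keeping the same id across a gap needs extra machinery such as motion prediction or appearance matching.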