This project aims to draw a leading dot on the screen for the controller to follow, so that the kart can win the race at the fastest possible speed.
To improve the performance, that is, the speed, of the kart in the pystk program, we need to improve the model with changes to the training logic, build a controller that turns faster and more steadily, predict the road ahead and act accordingly, and detect objects on the road so the kart can avoid or collect them.
We choose a six-layer convolutional neural network, with kernel size 5, stride 2, and padding 2, whose first layer has 3 input channels and 32 output channels, as our baseline network, and then try to optimize it further.
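A minimal PyTorch sketch of this baseline is below. The text only fixes the first layer's channel widths (3 in, 32 out), so keeping the remaining five layers at 32 channels, and the 96x128 input resolution, are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PlannerCNN(nn.Module):
    """Six conv layers, kernel 5, stride 2, padding 2, first layer 3 -> 32.
    Channel widths after the first layer are assumptions, not the
    project's exact architecture."""
    def __init__(self):
        super().__init__()
        layers = []
        in_ch = 3
        for _ in range(6):
            layers += [nn.Conv2d(in_ch, 32, kernel_size=5, stride=2, padding=2),
                       nn.ReLU()]
            in_ch = 32
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = PlannerCNN()
# A 96x128 RGB frame (resolution assumed); each stride-2 conv roughly
# halves the spatial size: 96x128 -> 48x64 -> ... -> 2x2.
out = model(torch.zeros(1, 3, 96, 128))
print(tuple(out.shape))  # (1, 32, 2, 2)
```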
Reinforcement learning could be used to learn all controller behaviors, such as acceleration, turning, braking, and drifting. However, since there are too many behaviors and states, this would have a very high learning cost, so we choose to learn the aim point instead of all of the kart's operations. Since we are learning the aim point, we then need to minimize the time lost when the controller cannot keep up with it.
Our controller's ideas are partly borrowed from real racers. Real-life racers brake as late as possible before entering a corner and open the throttle as early as possible when exiting it; they also open the throttle as much as they can whenever possible and brake as hard as possible when they need to. We therefore push every parameter to its limit: when accelerating, apply maximum acceleration, and when turning, steer at full lock.
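The "push every parameter to its limit" idea can be sketched as a bang-bang controller. The field names mirror pystk's action fields, but the function returns a plain dict, and the thresholds (0.2 for full-lock steering, 0.5 for drifting, target velocity 25) are illustrative assumptions rather than the project's tuned values.

```python
def control(aim_x: float, velocity: float, target_velocity: float = 25.0) -> dict:
    """Bang-bang controller sketch: always use extreme values.
    aim_x is the aim point's horizontal position in [-1, 1]."""
    action = {"acceleration": 0.0, "steer": 0.0,
              "drift": False, "brake": False, "nitro": False}
    # Full throttle (plus nitro) whenever we are below the target speed.
    if velocity < target_velocity:
        action["acceleration"] = 1.0
        action["nitro"] = True
    # Steer at full lock toward the aim point instead of proportionally.
    if abs(aim_x) > 0.2:
        action["steer"] = 1.0 if aim_x > 0 else -1.0
    # Drift through sharp corners to keep the turning rate high.
    action["drift"] = abs(aim_x) > 0.5
    return action
```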
There are only two goals: make the kart faster on straight roads and avoid being rescued on turns. So we do not have to stick with the idea of taking every corner safely at high speed; we can also slow down at sharp turns to make them safer. We therefore only care about the point where the kart gets rescued, which is when its velocity drops below 1.0. At that point we add a signal, so the next time the kart passes through, it receives the signal and learns to slow down.
We label the second coordinate of the aim point as 100, and in the controller, if we receive the value 100, we set the brake to true. Since braking is the only way we slow down, the number of frames we brake for determines how much velocity we lose.
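A sketch of this sentinel-value check is below. The value 100 comes from the text; holding the brake for a fixed 10-frame window once the signal is seen is an illustrative assumption (the text only says the number of braking frames controls how much speed is shed).

```python
BRAKE_SIGNAL = 100  # sentinel written into the label's second coordinate

def maybe_brake(aimpoint, frames_left: int, brake_duration: int = 10):
    """Return (brake, frames_left). Seeing the sentinel starts a braking
    window; brake_duration is an assumed value, not the project's."""
    if aimpoint[1] == BRAKE_SIGNAL:
        frames_left = brake_duration
    brake = frames_left > 0
    return brake, max(0, frames_left - 1)
```

On the frame carrying the sentinel the brake turns on, and it stays on for the following frames until the window runs out.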
During our testing with this approach, another problem appeared: we cannot always tell when the kart has left the road. As in the picture, the kart has already strayed too far from the track and started to turn around. The rescue logic only triggers when the velocity drops below 1, but by then the kart may already be far off the road, so the 10-frame head start we allow does not work every time, and adding more slow-down logic would reduce the overall speed.
So we went back to study how the kart behaves when it runs into trouble. On a straight road the kart will certainly not leave the track, and with our controller the common action before a turn is drifting. We therefore add logic to record the time of the last drift and use it as the time to start slowing down.
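One way to sketch this bookkeeping: remember the frames at which drifts began on earlier laps, and brake shortly before reaching those frames again. The 10-frame look-ahead window and the frame-set representation are illustrative assumptions.

```python
class DriftSlowdown:
    """Record past drift starts and slow down just before them.
    The window length is an assumed parameter."""
    def __init__(self, window: int = 10):
        self.window = window
        self.drift_frames = set()  # frames where a drift started earlier

    def record(self, frame: int, drifting: bool):
        # Remember where the controller last had to drift.
        if drifting:
            self.drift_frames.add(frame)

    def should_slow(self, frame: int) -> bool:
        # Brake if a recorded drift point lies just ahead of us.
        return any(0 <= d - frame <= self.window for d in self.drift_frames)
```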
In this way, the drift time serves as our reference point for slowing down, which gives us a consistent standard for every turning point.
Hitting a banana on the track clearly slows the kart down for two to three seconds, while collecting a nitro bottle or other useful item makes it faster. Exploiting this, however, requires detecting objects on the road in real time and making decisions based on that information. We therefore decided to use YOLOv5 to detect objects on the road and train a model that fits our requirements. YOLO divides the picture into an n*n grid, runs identification and detection on each cell, and returns the detection results; finally, non-maximum suppression keeps only the best detections.
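The non-maximum suppression step mentioned above can be illustrated with a minimal pure-Python sketch (YOLOv5 itself ships its own optimized implementation): keep the highest-scoring box, discard any remaining box that overlaps it too much, and repeat. Boxes here are `(x1, y1, x2, y2, score)` tuples, and the 0.5 IoU threshold is a commonly used default, not the project's setting.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2, ...) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, iou_thresh=0.5):
    """Greedy NMS: keep the best box, drop heavily overlapping ones."""
    boxes = sorted(boxes, key=lambda b: b[4], reverse=True)
    kept = []
    while boxes:
        best = boxes.pop(0)
        kept.append(best)
        boxes = [b for b in boxes if iou(best, b) < iou_thresh]
    return kept
```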
The fundamental logic is that after detection, the locations of the bananas and tools appear in the result matrix, with each object also encoded as 1 or 0: 1 for a banana and 0 for a tool. We then use these locations in a conditional statement to control the steering whenever the kart encounters the situations described above.
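This conditional logic might look like the sketch below: nudge the aim point away from bananas (class 1) and toward tools (class 0). Detections here are `(cls, x_center)` pairs with the x-coordinate normalized to [0, 1]; the 0.25 "in our path" band and the 0.3 dodge offset are illustrative assumptions.

```python
def adjust_aim(aim_x: float, detections) -> float:
    """aim_x in [-1, 1]; detections are (cls, x_center) with
    x_center in [0, 1]. Class 1 = banana (avoid), 0 = tool (collect)."""
    for cls, x_center in detections:
        obj_x = 2 * x_center - 1            # map [0, 1] to [-1, 1]
        if abs(obj_x - aim_x) > 0.25:       # object not in our path
            continue
        if cls == 1:                        # banana: steer around it
            aim_x = obj_x + (0.3 if obj_x <= aim_x else -0.3)
        else:                               # tool: steer into it
            aim_x = obj_x
    return max(-1.0, min(1.0, aim_x))
```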
First, we need to create a custom driving dataset that can be fed into YOLOv5. We select pictures that contain tools or bananas, along with some that contain neither, and manually label each picture with the locations of the bananas and tools.
At first, we tried to detect the kart as well as the bananas and tools. However, possibly because the picture resolution is too low and the kart's color is similar to much of the map, the model sometimes recognized empty space as a kart. We therefore decided to detect only the positions of the tools and bananas, whose colors differ clearly from the map, and the result is quite satisfying.
In the end, combining all of these implementations produces the final result shown in the first video.