Welcome back! I didn't get much time to work on the course in the past 5 days!!! Finally resuming today! Today I reviewed Feature scaling part 1 and learned Feature scaling part 2 and Checking gradient descent for convergence. The course is getting harder: for a 20-minute video I spent double the time and needed to check external articles to get a better understanding.

Feature Scaling

Trying to understand what "Feature Scaling" is... What are the features and parameters in the formula below?

ŷ (predicted price) = w1x1 + w2x2 + b

x1 and x2 are features: the former represents the size of the house, the latter the number of bedrooms. w1 and w2 are parameters. When the possible range of values of a feature is large, a good model is more likely to learn a relatively small parameter value for it. Likewise, when the possible values of a feature are small, like the number of bedrooms, a reasonable value for its parameter will be relatively large, like 50. For example, if the size is around 2000 sq ft with w1 = 0.1, and the bedrooms are around 3 with w2 = 50, the two terms contribute on a comparable scale.

So how does this relate to gradient descent? At the end of the video, Andrew explained that the features need to be re-scaled or transformed so that the cost function J computed on the transformed data has a better shape, and gradient descent can find a much more direct path to the global minimum.

"When you have different features that take on very different ranges of values, it can cause gradient descent to run slowly, but rescaling the different features so they all take on a comparable range of values can speed up gradient descent significantly." - Andrew Ng

One key aspect of feature engineering is…
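To make the rescaling concrete, here is a minimal sketch of one common way to do it, z-score normalization, in Python with NumPy. The house data, the function name zscore_normalize, and the specific values are my own illustrations, not from the course:

```python
import numpy as np

# Made-up house data: column 0 = size in sq ft, column 1 = number of bedrooms.
# The two features live on very different scales (thousands vs. single digits).
X = np.array([
    [2104, 5],
    [1416, 3],
    [1534, 3],
    [852,  2],
], dtype=float)

def zscore_normalize(X):
    """Rescale each feature (column) to mean 0 and standard deviation 1."""
    mu = X.mean(axis=0)     # per-feature mean
    sigma = X.std(axis=0)   # per-feature standard deviation
    return (X - mu) / sigma, mu, sigma

X_norm, mu, sigma = zscore_normalize(X)
print(X_norm)  # both columns now take values in a comparable range around 0
```

After this transformation both features take on comparable ranges, which is exactly the condition Andrew says lets gradient descent take a more direct path to the minimum. One thing to remember: any new example has to be rescaled with the same mu and sigma before making a prediction.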