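In summary, L1 loss (absolute error) is more robust to outliers, while L2 loss (squared error) is more sensitive to them because squaring amplifies large residuals. When used as a penalty on the weights, L1 drives some weights to exactly zero, which gives sparsity and is useful for feature selection, while L2 shrinks weights toward zero but retains all features. The choice depends on the data and use case: whether outlier robustness is critical, how sparse the model weights should be, and so on.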
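Both points can be checked on a toy example. The sketch below is only illustrative: the data, the weight vector, the penalty strength `lam`, and the helper names `soft_threshold` and `l2_shrink` are all made up for this example. It relies on two standard facts: the constant that minimizes absolute error is the median and the constant that minimizes squared error is the mean, and a single proximal step for an L1 penalty zeroes small weights while the same step for an L2 penalty only rescales them.

```python
import numpy as np

# Hypothetical toy data: four typical values and one outlier.
values = np.array([1.0, 2.0, 3.0, 4.0, 100.0])

# Best constant prediction under each loss:
# L2 (squared error) is minimized by the mean, L1 (absolute error) by the median.
l2_estimate = values.mean()       # pulled strongly toward the outlier
l1_estimate = np.median(values)   # stays with the bulk of the data


def soft_threshold(w, lam):
    """Proximal step for the penalty lam * ||w||_1.

    Shrinks every weight toward zero by lam and sets weights with
    |w| <= lam to exactly zero.
    """
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


def l2_shrink(w, lam):
    """Proximal step for the penalty (lam / 2) * ||w||_2^2.

    Rescales every weight by 1 / (1 + lam); nonzero weights stay nonzero.
    """
    return w / (1.0 + lam)


# Hypothetical weights and penalty strength, chosen only for illustration.
weights = np.array([0.05, -0.3, 1.2])
lam = 0.1

l1_weights = soft_threshold(weights, lam)  # first weight becomes exactly 0
l2_weights = l2_shrink(weights, lam)       # all weights shrink, none vanish

print("L2 estimate (mean):  ", l2_estimate)
print("L1 estimate (median):", l1_estimate)
print("After L1 step:", l1_weights)
print("After L2 step:", l2_weights)
```

On this data the mean lands far from the four typical values while the median does not, and only the L1 step produces an exact zero. Real models apply these steps repeatedly inside an optimizer, so treat this as a picture of the mechanism rather than a training recipe.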